Breakthrough in presentation of simulation results
SIMSCRIPT II.5 with SIMANIMATION 
Now you see an animated picture of the system 


Free trial and training 

See for yourself how simulation 
results are now easier to understand. 

The free trial contains everything 
you need to try SIMANIMATION® 
on your computer. 

We send you SIMSCRIPT II.5, 
animated models, and complete 
documentation. You can build your 
own model or modify one of ours. 

Try the SIMSCRIPT II.5
language, the timeliness of our support,
the accuracy of our documentation,
and the facilities for error
checking: everything you need for a
successful project.

No cost, no obligation. 

Act now for free training 

For a limited time we will also include
free training.

For immediate information 
Call Hal Duncan at (619)
457-9681. In the UK call Richard
Eve on (01) 940-3606.

[Figure captions: Transportation network; Telecommunications-COMNET II.5]
Taking The Lead In Applied R&D
"If there's a better way,
we'll find it." 




Dr. Alan Salisbury, Director 
Contel Technology Center

A Diversified Company, 
With One Vision. 

At Contel Corporation, finding 
the better way is our goal. It's also 
our business. And our efforts have 
resulted in over 25 years of leader¬ 
ship in the telecommunications 
and information services industry. 
As one of the youngest companies 
ever admitted to the New York 
Stock Exchange, we've grown 
through hundreds of acquisitions 
to become a $3-billion, Forbes 500 
corporation. We also rank as the 
third-largest independent tele¬ 
phone company in the U.S. It's 
exciting to be out in front. 


And Ready To 
Take The Lead. 

That's why we've taken the initia¬ 
tive with a brand new Technology 
Center responsible for the transfer 
of Applied R&D technology to 
products and services used by 
all of our operating companies 
and customers. We'll be turning 
emerging technologies into useful 
solutions. And bridging the gap 
between research and application. 

With Imaginations 
At Work. 

That's what you’ll find here. 

The best minds in the business. 
Leading Scientists and Engineers 
creating innovative and practical 
solutions for better telecom¬ 
munications and information 
systems. Those who appreciate the 
freedom to design their own proj¬ 
ects. And influence our future 


direction. If you're looking to turn 
your ideas into realities, you can 
do it here. We're building four ini¬ 
tial laboratories, yet, we offer the 
opportunity and challenge of 
crossing technology lines. 

Artificial Intelligence/ 
Man-Machine Interface 
Laboratory 

This laboratory 
is involved in 
the application 
of AI and Expert 
System tech¬ 
niques to the 
broad class of 
telecommunications and informa¬ 
tion systems problems, and 
pursues improved man-machine 
interfaces for these systems as 
well. Typical project areas include 
speech recognition, large-screen 
displays and optimal workstation 
configurations. 















Software Engineering 
Laboratory 

Software is at the 
heart of not only 
information sys¬ 
tems, but also 
telecommunica¬ 
tions systems. 
The Software 
Engineering Laboratory is develop¬ 
ing improved tools and techniques 
for enhancing productivity in the 
software development and mainte¬ 
nance process. Areas of interest 
include information engineering, 
data base architectures, rapid 
prototyping and Computer Aided 
Software Engineering (CASE) tools 
and methodologies. 



Transmission & 
Switching Systems 
Laboratory 

The focus here
is on advanced
technologies for
improved telecommunications
systems. Very Small Aperture
Satellite Terminals (VSAT), fiber
optic transmission systems, survivable
communications, Integrated
Services Digital Network (ISDN)
and fast packet switching are representative
project areas, as are
message and signal processing.

Networks & 

Secure Systems 
Laboratory 

Local and wide 
area distribution 
systems for both 
unclassified and 
classified applica¬ 
tions play a major role in many 
Contel systems. This laboratory is 
pursuing improved techniques for 
designing and implementing LANs 
and WANs with an emphasis on 
security and reliability. Research 
interests concern areas such as 
network protocols and topologies, 
network management and control, 
distributed systems, and simulation 
and modeling. 

You Won't Find A 
Better Challenge. 

Or Commitment. 

We're so serious about our tech¬ 
nology, we're building a new 




state-of-the-art facility for the 
Contel Technology Center in the 
Wohlstetter Technology Park in 
Fairfax County, Virginia. Behind 
this commitment are more than 
$5 billion in assets. And an overall 
company objective to be more 
market-driven, more technology- 
focused. With the Center in place, 
we can attain it. 

Imagine Working Here. 

Close to the Nation's capital, you'll 
find Fairfax County, VA to be an 
upscale community offering afford¬ 
able, quality housing, a shorter 
commute, superior educational 
systems, and a growing economy 
spurred by both commercial and 
residential development. Plus 
plenty of recreational and cultural 
choices. In short, it’s a lifestyle 
you've been looking for. 

Isn’t it time you took the lead? 

You can, with Contel. 

Please send resume, including sal¬ 
ary history, to: Contel Technology 
Center, 12015 Lee Jackson 
Memorial Highway, Fairfax, 

Virginia 22033, (703) 359-7732. 

We are an equal opportunity em¬ 
ployer m/f/h/v. U.S. Citizenship 
required. 
President’s Message


Our goals for 1988 

In this month’s column, I want to 
report the results of the Executive Com¬ 
mittee’s January 6-7 planning meeting. 
As stated in my January message, I 
intend to capitalize on the various self- 
assessment studies completed by our 
boards in 1987 and to expand the Com¬ 
puter Society’s role in international 
activities. In addition, the society will 
take the necessary steps to open its 
Asian Office in Tokyo during the first 
part of this year. 

To this end, I gave the following 
charge to Executive Committee mem¬ 
bers prior to the planning meeting: 

Present an overview of accomplish¬ 
ments desired for 1988. These 
should result in (1) implementing 
the principal recommendations 
from the self-assessment reports, 

(2) enhancing international activi¬ 
ties and relations, (3) raising the 
visibility of the Computer Society, 

(4) increasing the society’s presence 
at major conferences, (5) raising 
staff effectiveness, (6) expanding 
the use of appropriate technology 
throughout the society (electronic 
mail, electronic media, etc.), and 
(7) increasing the emphasis on the 
awards program. 

The Executive Committee spent 
nearly two days responding to this 
charge. The discussion following each 
presentation attempted to define the 
requisite actions to be taken by each 
officer. At the end of the meeting, the 
group finalized a list of objectives for 
each board. The associated vice presi¬ 
dents will carry this information to their 
boards for further consideration at 
meetings held during Compcon Spring 
88 in San Francisco. A full activity 
plan, complete with performance mea¬ 
sures and appropriate milestones, will 
be developed by the boards and be 
available for presentation at the Execu¬ 
tive Committee meeting on March 3. 
Some of the major issues may also be 
placed on the agenda for the Board of 
Governor’s caucus the previous 
evening. 

The list of objectives is given below. 



Edward A. Parrish, Jr. 


Global goal: 

• Make the Computer Society the 
No. 1 organization for computing 
professionals worldwide. 

Education Activities Board: 

• Work to develop new programs and 
initiatives. 

• Interact with the Area Activities 
Board to help with development of 
student chapters. 

Area Activities Board: 

• Enhance programs directed at 
students. 

• Serve computer science as well as 
electrical engineering and computer 
engineering students. 

• Develop a comprehensive tutorial 
program. 

Membership and Information Board: 

• Continue promotions and other 
activities to reach membership goal 
of 94,500. 

• Develop generalized promotion 
techniques. 

• Address non-US membership pool. 

• Strengthen and expand information 
activities. 

Publications Board: 

• Expand and enhance CS Press 
activities. 

• Control/contain costs. 


• Exploit electronic media—for 
example, bulletin boards and news 
groups. 

Standards Activities Board: 

• Consolidate activities. 

• Expand international activities. 

• Refine international procedures. 

Conferences and Tutorials Board: 

• Enhance quality of products. 

• Consider experiment with 
individualized, track-oriented con¬ 
ference proceedings. 

• Consider board reorganization. 

Technical Activities Board: 

• Develop quality products (special¬ 
ized publications, electronic news 
groups, etc.) for technical commit¬ 
tee members. 

• Consider implementation of mem¬ 
bership application program. 

Issues for all boards: 

• Quality of programs and activities. 

• Membership promotion and 
retention. 

• Attention to students and new 
graduates. 

• Promotion and enhancement of the 
Computer Society’s image among 
technical professionals. 

• Financial impact and planning. 

• Membership communications. 

Considerations for the Awards Com¬ 
mittee: 

• New, high prestige awards. 

• Awards for student work—select 
best three to five dissertations in 
computer engineering and com¬ 
puter science and publish through 
CS Press. 

• Outstanding Young Computer 
Scientist/Engineer Award. 

At the conclusion of the meetings at 
Compcon Spring 88, each board will 
have established its activity plan for the 
year, along with appropriate milestones 
and success measures. This should help 
provide continuity of purpose from 
meeting to meeting and involve each 
entity in establishing the society’s 
agenda. 

Watch for progress reports in this 
column in the months ahead. 

Edward A. Parrish, Jr. 

Computer Society President 
COMMUNICATIONS '88


DIGITAL TECHNOLOGY... 

SPANNING THE UNIVERSE 


June 12-15,1988 at The Wyndham Franklin Plaza Hotel, Philadelphia, Pennsylvania 

ICC '88... an international conference bringing together the world's knowledge in communications 
management and technology in the birthplace of America. 

A comprehensive technical program will be offered, complemented by an exciting social program. The
technical program will feature tours of local industry; a symposium on Open Network Architecture
(ONA); an evening “Quality” session featuring RBOC and industry executives; an exhibition hall providing
attendees with an opportunity to mix with exhibitors' representatives; and, of course, a broad “tracked”
series of technical sessions spotlighting: • Optical technologies • Intelligent networks • Data
communication techniques • Radio systems • Operations, performance and quality.

ICC '88 will also offer keynote speakers at a conference banquet and an IEEE awards
luncheon, as well as a special social program highlighted by a
Philadelphia-Style Block Party; Tours of Historic Philadelphia
and the Land of the Pennsylvania Dutch; a Dinner Cruise;
and a great casino excursion to Atlantic City.


For More Information Contact ICC ’88 by phone... 

1-800-ICC88PH (in the continental U.S.) (215) 972-1308 (outside U.S.) 
weekdays between 8 A.M. and 4:30 P.M. (E.S.T.) or complete the coupon below and mail.

CLIP AND MAIL TO

ICC ’88, c/o ATT Network Systems,
1800 John F. Kennedy Blvd., Suite 1300,
Philadelphia, PA 19103

□ Please send more information and registration for ICC ’88
□ Please send information for ICC ’88 exhibitors

NAME:
COMPANY:
ADDRESS:
PHONE:



























Conventional digital computers are
extremely good at executing 
sequences of instructions that 
have been precisely formulated for them, 
with the “stored program” representing 
the processing steps that need to be done. 
The human brain, on the other hand, per¬ 
forms well at such tasks as vision, speech, 
information retrieval, and complex spatial 
and temporal pattern recognition in the 
presence of noisy and distorted data— 
tasks that are very difficult for sequential 
digital computers to do at all. How does 
the brain accomplish this, given that its 


“processing elements” (neurons) are sig¬ 
nificantly slower than the processing ele¬ 
ments of contemporary supercomputers? 
Neurons, which are electrochemical 
devices, can respond in milliseconds, 
whereas current, off-the-shelf electronic 


technology can switch states in 
nanoseconds. 

Current estimates place the number of
neurons in the human brain at 10^11. They
are organized in a complex, unknown 
interconnection structure, and individual 
neurons may be connected to several thou¬ 
sand other neurons. It is not yet under¬ 
stood how this massively parallel 
interconnected system of neurons (a “bio¬ 
logical neural network”) allows us to 
store, represent, retrieve, and manipulate 
data such as images, smells, sensations, 
and thoughts. We do not know how it 
represents a person’s face, for example, so
that merely seeing someone’s eyes allows 
us to formulate a complete image in which 
we recall other important personal 
information—such as the way he or she 
walks. We do not know how we store 
equations, how we manipulate ideas with¬ 
out writing them down, or how we learn to 
speak, see, and hear. Yet, we do all of these 
things amazingly well. 

Once vision and speech, for example,
are well enough understood to be reduced
to algorithmic form, they can be realized 
on conventional digital computers just as 
well as on artificial neural systems. At that 
point, it will be a cost/performance issue 
as to which technology or combination of 
technologies will be employed to realize 
these capabilities. 

Both the literature and the number of 
professional society meetings focusing on 
artificial neural systems are growing at an 
amazing rate. A number of technical dis¬ 
ciplines are involved in the wide variety of 
independent industrial, government, and 
university-based activities and studies. 
Neurobiologists, neurophysiologists, 
mathematicians, physicists, psychologists, 
computer scientists, and engineers are 
studying and formulating theories about 
how computations actually occur in 
nature. 

Some researchers are undertaking these 
studies so as to ultimately understand how 
the brain works and thus closely adhere to 
or attempt to understand the biology 
involved. Others are evolving entirely new 
computation paradigms based on the sim¬ 
ple models that are part of the new the¬ 
ories. Some of the new paradigms are best 
termed “biologically influenced” because 
they involve assumptions that are not bio¬ 
logically accurate. These biologically 
influenced computational paradigms have 
been used to solve difficult optimization 
problems and to implement associative 
memories. 1 Although the field of artificial
neural systems has roots going back
over 25 years, 1,2 there currently is no consensus
on what is important to study or
how to go about studying it. Some
researchers are combining conventional
AI’s symbolic and heuristic approach to
complex problem solving with the subsymbolic
approach, where neural models
apparently perform well. 3

This special issue of Computer is enti¬ 
tled “Artificial Neural Systems”—that is, 
“artificial” as opposed to the “biological” 
neural systems appearing in nature. The 
study of artificial neural systems goes 
under the guise of many names in the lit¬ 


erature: neural networks, connectionist 
models, parallel distributed processing 
models, layered self-adaptive systems, and 
self-organizing systems. Terms like neurocomputers,
neuromorphic systems,
netware, and cyberware are being introduced
into our technical “jargon.” For a
good introduction to computing with arti¬ 
ficial neural systems—one that reviews six 
important neural models used for pattern 
classification—see Lippmann. 2 

This issue is exclusively dedicated to a 
selection of interesting and important 
work in several areas of artificial neural 
systems. Because workers in the field are 
exploring many different areas, the articles 
reflect the diversity and robustness of these 
interests. Some of the authors closely 
adhere to the biology involved, while 
others are developing biologically 
influenced models and systems. Results 
are presented that are based on 

(1) using the most elementary model of
a neuron (one that sums its “weighted”
inputs and outputs the result through a
nonlinearity),

(2) interconnecting these “simple neurons”
in a network topology involving
feedback, and

(3) employing various rules by which
weights are adjusted (that is, the way in
which learning, meaning self-adaptation
and self-organization, occurs).
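
The elementary neuron model in item (1) and a weight-adjustment rule of the kind named in item (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the logistic sigmoid as the nonlinearity, the error-driven update rule, and the names `neuron_output` and `adjust_weights` are all hypothetical choices made here, not taken from any article in this issue.

```python
import math


def neuron_output(weights, inputs, bias=0.0):
    """Sum the weighted inputs and pass the result through a nonlinearity."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Logistic sigmoid: an assumed, commonly used choice of nonlinearity.
    return 1.0 / (1.0 + math.exp(-activation))


def adjust_weights(weights, inputs, target, rate=0.5):
    """Apply one step of a simple error-driven rule (illustrative only).

    Each weight moves in proportion to its input and to the difference
    between the desired output and the current output.
    """
    error = target - neuron_output(weights, inputs)
    return [w + rate * error * x for w, x in zip(weights, inputs)]


weights = [0.0, 0.0]
inputs = [1.0, 1.0]
before = neuron_output(weights, inputs)
weights = adjust_weights(weights, inputs, target=1.0)
after = neuron_output(weights, inputs)
print(before, after)  # the output moves toward the target after one step
```

Interconnecting such units with feedback, as in item (2), would mean feeding outputs back in as inputs on the next step; that is left out here to keep the sketch short.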

Applications such as visual pattern recog¬ 
nition, speech recognition, motion detec¬ 
tion, adaptive pattern recognition, as well 
as VLSI and simulation implementations 
of artificial neural systems are presented. 

I hope you enjoy reading these articles. 
Clearly, this newly reincarnated field has
some interesting and promising results to 
share, but it is not known how these results 
will scale up to more real-world related 
tasks. Numerous important research ques¬ 
tions emerge from even this small sampling 
of articles. For example: Are specific 
models more appropriate for given classes 
of computations than other models? How 
does the sample set of learning situations 
affect the resulting characteristics of the 
neural system both during training and 
during operation? How can supervised 
and unsupervised learning be combined in 
such systems? How do the various inter¬ 
connection structures affect the computa¬ 
tional and operational characteristics of 
the system? What hardware is best for sup¬ 
porting the particular neural-network 
models? What algorithms can be formu¬ 
lated using the massively parallel paradigm 
of neural networks? And on and on... □ 
The “Neural” Phonetic
Typewriter 

Teuvo Kohonen 

Helsinki University of Technology 


In 1930 a Hungarian scientist,
Tihamer Nemes, filed a patent application
in Germany for the principle
of making an optoelectrical system automatically
transcribe speech. His idea was
to use the optical sound track on a movie
film as a grating to produce diffraction 
patterns (corresponding to speech spec¬ 
tra), which then could be identified and 
typed out. The application was turned 
down as “unrealistic.” Since then the 
problem of automatic speech recognition 
has occupied the minds of scientists and 
engineers, both amateur and professional. 

Research on speech recognition princi¬ 
ples has been pursued in many laborato¬ 
ries around the world, academic as well as 
industrial, with various objectives in 
mind. 1 One ambitious goal is to imple¬ 
ment automated query systems that could 
be accessed through public telephone lines, 
because some telephone companies have 
observed that telephone operators spend
most of their time answering queries. An
even more ambitious plan, adopted in 
1986 by the Japanese national ATR 
(Advanced Telecommunication Research) 
project, is to receive speech in one language
and to synthesize it in another, on line.

The dream of a phonetic typewriter
that can produce text from arbitrary dictation
is an old one; it was envisioned by
Nemes and is still being pursued today.
Several dozen devices, even special
microcircuits, that can recognize isolated
words from limited vocabularies with
varying accuracy are now on the market.
These devices have important applications,
such as the operation of machines by
voice, various dispatching services that
employ voice-activated devices, and aids
for seriously handicapped people. But in
spite of big investments and the work of
experts, the original goals have not been
reached. High-level speech recognition has
existed so far only in science fiction.

Recently, researchers have placed great
hopes on artificial neural networks to perform
such “natural” tasks as speech
recognition. This was indeed one motivation
for us to start research in this area
many years ago at Helsinki University of
Technology. This article describes the
result of that research: a complete “neural”
speech recognition system, which
recognizes phonetic units, called phonemes.

Although motivated by neural network
principles, the choices in its design must be
regarded as a compromise of many technical
aspects of those principles. As our
system is a genuine “phonetic typewriter”
intended to transcribe orthographically
edited text from an unlimited vocabulary,
it cannot be directly compared with any
more conventional, word-based system
that applies classical concepts such as
dynamic time warping 1 and hidden Markov
models. 2


Based on a neural
network processor for
the recognition of
phonetic units of
speech, this speaker-
adaptive system
transcribes dictation
using an unlimited
vocabulary.


Why is speech 
recognition difficult? 

Automatic recognition of speech
belongs to the broader category of pattern
recognition tasks, 3 for which, during the
past 30 years or so, many heuristic and
even sophisticated methods have been
tried. It may seem strange that while progress
in many other fields of technology has
been astoundingly rapid, research investments
in these “natural” tasks have not
yet yielded adequate dividends. After ini¬ 
tial optimism, the researchers in this area 
have gradually become aware of the many 
difficulties to be surmounted. 

Human beings’ recognition of speech 
consists of many tasks, ranging from the 
detection of phonemes from speech wave¬ 
forms to the high-level understanding of 
messages. We do not actually hear all 
speech elements; we realize this easily 
when we try to decipher foreign or uncom¬ 
mon utterances. Instead, we continuously 
relate fragmentary sensory stimuli to con¬ 
texts familiar from various experiences, 
and we unconsciously test and reiterate our 
perceptions at different levels of abstrac¬ 
tion. In other words, what we believe we 
hear, we in fact reconstruct in our minds 
from pieces of received information. 

Even in clear speech from the same 
speaker, distributions of the spectral sam¬ 
ples of different phonemes overlap. Their 
statistical density functions are not Gaus¬ 
sian, so they cannot be approximated ana¬ 
lytically. The same phonemes spoken by 
different persons can be confused too; for 
example, the /e/ of one speaker might 
sound like the /n/ of another. For this rea¬ 
son, absolutely speaker-independent 
detection of phonemes is possible only 
with relatively low accuracy. 
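
Why overlapping distributions cap the attainable accuracy can be made concrete with a small Python sketch. Wherever two class densities overlap, even the best possible decision must misclassify the less probable class, and these unavoidable errors add up. The five-bin densities and the names `minimum_error`, `phoneme_e`, and `phoneme_n` below are hypothetical values invented for illustration; they are not measurements from our system.

```python
def minimum_error(density_a, density_b, prior_a=0.5):
    """Smallest error rate any classifier can reach for two classes.

    Each density is a list of probabilities over the same discrete bins.
    In every bin the best decision picks the more probable class, so the
    less probable class contributes the unavoidable error.
    """
    prior_b = 1.0 - prior_a
    return sum(min(prior_a * pa, prior_b * pb)
               for pa, pb in zip(density_a, density_b))


# Hypothetical, deliberately non-Gaussian densities over five spectral bins.
phoneme_e = [0.05, 0.40, 0.10, 0.40, 0.05]
phoneme_n = [0.00, 0.10, 0.50, 0.30, 0.10]

print(minimum_error(phoneme_e, phoneme_n))  # about 0.275 for these made-up values
```

With disjoint densities the same function returns zero, which is the situation of most artificial test data and not that of natural speech.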

Some phonemes are spectrally clearer 
and stabler than others. For speech recog¬ 
nition purposes, we distinguish three 
acoustically different categories: 

(1) Vocal (voiced, nonturbulent) phonemes,
including the vowels, semivowels
(/j/, /v/), nasals (/m/, /n/, /ŋ/), and
liquids (/l/, /r/)

(2) Fricatives (/s/, /ʃ/, /z/, etc.)

(3) Plosives (/k/, /p/, /t/, /b/,
/d/, /g/, etc.)

The phonemes of the first two categories 
have rather well-defined, stationary spec¬ 
tra, whereas the plosives are identifiable 
only on the basis of their transient proper¬ 
ties. For instance, for /k,p,t/ there is a 
silence followed by a short, faint burst of 
voice characteristic of each plosive, 
depending on its point of articulation (lips, 
tongue, palate). The transition of the 
speech signal to the next phoneme also 
varies among the plosives. 

A high-level automatic speech recognition
system also should interpret the
semantic content of utterances so that it
can maintain selective attention to particular
portions of speech. This ability would
call for higher thinking processes, not only
imitation of the operation of the preattentive
sensory system. The first large experimental
speech-understanding systems
followed this line of thought (see the report
of the ARPA project, 4 which was completed
around 1976), but for commercial
application such solutions were too expensive.
Machine interpretation of the meaning
of complete sentences is a very difficult
task; it has been accomplished only when
the syntax has been artificially limited.
Such “party tricks” may have led the public
to believe that practical speech recognition
has reached a more advanced level
than it has. Despite decades of intensive
research, no machine has yet been able to
recognize general, continuous speech
produced by an arbitrary speaker, when
no speech samples have been supplied.


Machine
interpretation of
complete sentences
has been
accomplished only
with artificially
limited syntax.

Recognition of the speech of arbitrary 
speakers is much more difficult than 
generally believed. Existing commercial 
speaker-independent systems are restricted
to isolated words from vocabularies not 
exceeding 40 words. Reddy and Zue esti¬ 
mated in 1983 that for speaker- 
independent recognition of connected 
speech, based on a 20,000-word vocabu¬ 
lary, a computing power of 100,000 MIPS, 
corresponding to 100 supercomputers, 
would be necessary. 5 Moreover, the 
detailed programs to perform these oper¬ 
ations have not been devised. The difficul¬ 
ties would be even greater if the 
vocabularies were unlimited, if the utter¬ 
ances were loaded with emotions, or if 
speech were produced under noisy or 
stressful conditions. 

We must, of course, be aware of these 
difficulties. On the other hand, we would 
never complete any practical speech recog¬ 
nizer if we had to attack all the problems 
simultaneously. Engineering solutions are 


therefore often restricted to particular 
tasks. For instance, we might wish to 
recognize isolated commands from a 
limited vocabulary, or to type text from 
dictation automatically. Many satisfactory 
techniques for speaker-specific, isolated- 
word recognition have already been devel¬ 
oped. Systems that type English text from 
clear dictation with short pauses between 
the words have been demonstrated. 6 
Typing unlimited dictation in English is 
another intriguing objective. Systems 
designed for English recognize words as 
complete units, and various grammatical 
forms such as plural, possessive, and so 
forth can be stored in the vocabulary as 
separate word tokens. This is not possible 
in many other languages—Finnish and 
Japanese, for example—in which the 
grammar is implemented by inflections 
and there may be dozens of different 
forms of the same root word. For inflectional
languages the system must construct
the text from recognized phonetic units,
taking into account the transformations of 
these units due to coarticulation effects 
(i.e., a phoneme’s acoustic spectrum varies 
in the context of different phonemes). 

Especially in image analysis, but in 
speech recognition too, many newer 
methods concentrate on structural and 
syntactic relationships between the pattern 
elements, and special grammars for their 
analysis have been developed. It seems, 
however, that the first step, preanalysis 
and detection of primary features such as 
acoustic spectra, is still often based on 
rather coarse principles, without careful 
consideration of the very special statistical 
properties of the natural signals and their 
clustering. Therefore, when new, highly 
parallel and adaptive methods such as arti¬ 
ficial neural networks are introduced, we 
assume that their capacities can best be uti¬ 
lized if the networks are made to adapt to 
the real data, finding relevant features in 
the signals. This was in fact one of the cen¬ 
tral assumptions in our research. 

To recapitulate, speech is a very difficult 
stochastic process, and its elements are not 
unique at all. The distributions of the 
different phonemic classes overlap seri¬ 
ously, and to minimize misclassification 
errors, careful statistical as well as struc¬ 
tural analyses are needed. 

The promise of neural 
computers 

Because the brain has already imple¬ 
mented the speech recognition function 
(and many others), some researchers have
reached the straightforward conclusion 
that artificial neural networks should be 
able to do the same, regarding these net¬ 
works as a panacea for such “natural” 
problems. Many of these people believe 
that the only bottleneck is computing 
power, and some even expect that all the 
remaining problems will be solved when, 
say, optical neural computers, with a vast 
computing capacity, become feasible. 
What these people fail to realize is that we 
may not yet have discovered what biolog¬ 
ical neurons and neural systems are like. 
Maybe the machines we call neural networks
and neural computers are too simple.
Before we can utilize such computing
capacities, we must know what and how to
compute.

It is true that intriguing simulations of 
new information-processing functions, 
based on artificial neural networks, have 
been made, but most of these demonstra¬ 
tions have been performed with artificial 
data that are separable into disjoint 
classes. Difficulties multiply when natural, 
stochastic data are applied. In my own 
experience the quality of a neural network
must be tested in an on-line connection
with a natural environment. One of the
most difficult problems is dealing with 
input data whose statistical density func¬ 
tions overlap, have awkward forms in 
high-dimensional signal spaces, and are 
not even stationary. Furthermore, in prac¬ 
tical applications the number of samples 
of input data used for training cannot be 
large; for instance, we cannot expect that 
every user has the patience to dictate a 
sufficient number of speech samples to 
guarantee ultimate accuracy. 

On the other hand, since digital comput¬ 
ing principles are already in existence, they 
should be used wherever they are superior 
to biological circuits, as in the syntactic 
analysis of symbolic expressions and even 
in the spectral analysis of speech wave¬ 
forms. The discrete Fourier transform has 
very effective digital implementations. 

Our choice was to try neural networks 
in a task in which the most demanding 
statistical analyses are performed— 
namely, in the optimal detection of the 
phonemes. In this task we could test some 
new learning methods that had been 
shown to yield a recognition accuracy 
comparable to the decision-theoretic max¬ 
imum, while at the same time performing 
the computations by simple elements, 
using a minimal amount of sample data 
for training. 


In practical 
neural-network 
applications, the 
number of 
input samples 
used for training 
cannot be large. 


Acoustic preprocessing 

Physiological research on hearing has 
revealed many details that may or may not 
be significant to artificial speech recogni¬ 
tion. The main operation carried out in 
human hearing is a frequency analysis 
based on the resonances of the basilar 
membrane of the inner ear. The spectral 
decomposition of the speech signal is 
transmitted to the brain through the audi¬ 
tory nerves. Especially at lower frequen¬ 
cies, however, each peak of the pressure 
wave gives rise to separate bursts of neu¬ 
ral impulses; thus, some kind of time- 
domain information also is transmitted by 
the ear. On the other hand, a certain degree 
of synchronization of neural impulses to 
the acoustic signals seems to occur at all 
frequencies, thus conveying phase infor¬ 
mation. One therefore might stipulate that 
the artificial ear contain detectors that 
mimic the operation of the sensory recep¬ 
tors as fully as possible. 

Biological neural networks are able to 
enhance signal transients in a nonlinear 
fashion. This property has been simulated 
in physical models that describe the 
mechanical properties of the inner ear and 
chemical transmission in its neural cells. 7,8 
Nonetheless, we decided to apply conven¬ 
tional frequency analysis techniques, as 
such, to the preprocessing of speech. The 
main motivations for this approach were 
that the digital Fourier analysis is both 
accurate and fast and the fundamentals of 
digital filtering are well understood. Stan¬ 
dard digital signal processing has been 
considered sufficient in acoustic engineer¬ 
ing and telecommunication. Our decision 
was thus a typical engineering choice. We 
also believed the self-organizing neural
network described here would accept
many alternative kinds of preprocessing 
and compensate for modest imperfections, 
as long as they occur consistently. Our 
final results confirmed this belief; at least 
there were no large differences in recogni¬ 
tion accuracies between stationary and 
transient phonemes. 

Briefly, the complete acoustic 
preprocessor of our system consists of the 
following stages: 

(1) Noise-canceling microphone 

(2) Preamplifier with a switched- 
capacitor, 5.3-kHz low-pass filter 

(3) 12-bit analog-to-digital converter 
with 13.02-kHz sampling rate 

(4) 256-point fast Fourier transform, 
computed every 9.83 ms using a 256-point 
Hamming window 

(5) Logarithmization and filtering of 
spectral powers by fourth-order elliptic 
low-pass filters 

(6) Grouping of spectral channels into 
a 15-component real-pattern vector 

(7) Subtraction of the average from all 
components 

(8) Normalization of the resulting vec¬ 
tor into constant length 

Operations 3 through 8 are computed by 
the signal processor chip TMS 32010 (our 
design is four years old; much faster 
processors are now available). 
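
As a rough illustration, stages 4 through 8 can be sketched in Python. This is not the TMS 32010 implementation. The hop of 128 samples is inferred from the 9.83-ms frame interval at 13.02 kHz, the equal-width grouping of bins into 15 channels is an assumption, the smoothing filter of stage 5 is omitted, and a naive discrete Fourier transform stands in for the FFT.

```python
import cmath
import math

SAMPLE_RATE_HZ = 13020.0   # stage 3
FFT_SIZE = 256             # stage 4
HOP = 128                  # assumption: 128 samples is about 9.83 ms at 13.02 kHz
N_CHANNELS = 15            # stage 6


def hamming(n_points):
    """Standard Hamming window coefficients."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def power_spectrum(frame):
    """Naive DFT power spectrum of the lower half of the band."""
    n = len(frame)
    powers = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        powers.append(abs(acc) ** 2)
    return powers


def pattern_vector(frame):
    """Stages 4-8 for one frame; the low-pass smoothing of stage 5 is omitted."""
    window = hamming(len(frame))
    powers = power_spectrum([s * w for s, w in zip(frame, window)])
    log_powers = [math.log(p + 1e-12) for p in powers]   # small offset avoids log(0)
    # Assumed grouping: equal-width bands of adjacent bins.
    width = len(log_powers) // N_CHANNELS
    channels = [sum(log_powers[c * width:(c + 1) * width]) / width
                for c in range(N_CHANNELS)]
    mean = sum(channels) / N_CHANNELS                     # stage 7
    centered = [c - mean for c in channels]
    norm = math.sqrt(sum(c * c for c in centered)) or 1.0
    return [c / norm for c in centered]                   # stage 8


# Example frame: a 1-kHz tone sampled at 13.02 kHz.
frame = [math.sin(2.0 * math.pi * 1000.0 * t / SAMPLE_RATE_HZ) for t in range(FFT_SIZE)]
vector = pattern_vector(frame)
```

The resulting vector has 15 components, zero mean, and unit length, which is the form the network receives as input.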

In many speech recognition systems 
acoustic preprocessing encodes the speech 
signal into so-called LPC (linear predictive 
coding) coefficients, 1 which contain 
approximately the same information as the 
spectral decomposition. We preferred the 
FFT because, as will be shown, one of the 
main operations of the neural network that 
recognizes the phonemes is to perform 
metric clustering of the phonemic samples. 
The FFT, a transform of the signal, 
reflects its clustering properties better than 
a parametric code. 

We had the option of applying the over¬ 
all root-mean-square value of the speech 
signal as the extra sixteenth component in 
the pattern vector; in this way we expected 
to obtain more information on the tran¬ 
sient signals. The recognition accuracy 
remained the same, however, within one 
percent. We believe that the acoustic 
processor can analyze many other speech 
features in addition to the spectral ones. 
Another trick that improved accuracy on 
the order of two percent was to make the 
true pattern vector out of two spectra 30 
ms apart in the time scale. Since the two 
samples represent two different states of 
the signal, dynamic information is added
to the preanalysis.

Figure 1. Voronoi tessellation partitions a two-dimensional $(\xi_1, \xi_2)$ “pattern space”
into regions around reference vectors, shown as points in this coordinate system.
All vectors $(\xi_1, \xi_2)$ in the same partition have the same reference vector as their
nearest neighbor and are classified according to it. The solid and open circles,
respectively, represent reference vectors of two classes, and the discrimination
“surface” between them is drawn in bold.

Because the plosives must be distin¬ 
guished on the basis of the fast, transient 
parts of the speech waveform, we selected 
the spectral samples of the plosives from 
the transient regions of the signal, on the 
basis of the constancy of the waveform. 
On the other hand, there is evidence that 
the biological auditory system is sensitive 
not only to the spectral representations of 
speech but to their particular transient fea¬ 
tures too, and apparently it uses the non¬ 
linear adaptive properties of the inner ear, 
especially its hair cells, the different trans¬ 
mission delays in the neural fibers, and 
many kinds of neural gating in the audi¬ 
tory nuclei (processing stations between 
the ear and the brain). For the time being, 
these nonlinear, dynamic neural functions 
are not understood well enough to warrant 
the design of standard electronic analogies 
for them. 


Vector quantization 

The instantaneous spectral power values 
on the 15 channels formed from the FFT
can be regarded as a 15-dimensional real
vector in a Euclidean space. We might 
think that the spectra of the different pho¬ 
nemes of speech occupy different regions 
of this space, so that they could be detected 
by some kind of multidimensional dis¬ 
crimination method. In reality, several 
problems arise. One of them, as already 
stated, is that the distributions of the spec¬ 
tra of different phonemic classes overlap, 
so that it is not possible to distinguish the 
phonemes by any discrimination method 
with 100 percent certainty. The best we can 
do is to divide the space with optimal dis¬ 
crimination borders, relative to which, on 
the average, the rate of misclassifications 
is minimized. It turns out that analytical 
definition of such (nonlinear) borders is 
far from trivial, whereas neural networks 
can define them very effectively. Another 
problem is presented by the coarticulation 
effects discussed later. 

A concept useful for the illustration of 
these so-called vector space methods for 
pattern recognition and neural networks is 
called Voronoi tessellation. For simplicity, 
consider that the dissimilarity of two or 
more spectra can be expressed in terms of
their vectorial difference (actually the
norm of this difference) in an n- 
dimensional Euclidean space. Figure 1 
exemplifies a two-dimensional space in 
which a finite number of reference vectors 
are shown as points, corresponding to 
their coordinates. This space is partitioned 
into regions, bordered by lines (in general, 
hyperplanes) such that each partition con¬ 
tains a reference vector that is the nearest 
neighbor to any vector within the same 
partition. These lines, or the midplanes of 
the neighboring reference vectors, consti¬ 
tute the Voronoi tessellation, which 
defines a set of discrimination or decision 
surfaces. This tessellation represents one 
kind of vector quantization, which gener¬ 
ally means quantization of the vector space 
into discrete regions. 

One or more neighboring reference vec¬ 
tors can be made to define a category in the 
vector space as the union of their respec¬ 
tive partitions. Determination of such 
reference vectors was the main problem on 
which we concentrated in our neural net¬ 
work research. There are, of course, many 
classical mathematical approaches to this 
problem. 3 In very simple and straightfor¬ 
ward pattern recognition, samples, or pro¬ 
totypes, of earlier observed vectors are 
used as such for the reference vectors. For 
the new or unknown vector, a small num¬ 
ber of its nearest prototypes are sought; 
then majority voting is applied to them to 
determine classification. A drawback of 
this method is that for good statistical 
accuracy an appreciable number of refer¬ 
ence vectors are needed. Consequently, the 
comparison computations during classification,
especially if they are made serially,
become time-consuming; the unknown 
vector must be compared with all the refer¬ 
ence vectors. Therefore, our aim was to 
describe the samples by a much smaller 
representative set of reference vectors 
without loss of accuracy. 
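
The straightforward prototype method can be sketched as follows. The prototype vectors, the labels, and the choice k = 3 are made up for illustration; note that every classification scans all stored prototypes, which is the cost the text refers to.

```python
import math
from collections import Counter


def distance(a, b):
    """Euclidean norm of the vectorial difference."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(unknown, prototypes, k=3):
    """prototypes: list of (vector, label). Majority vote among the k nearest."""
    ranked = sorted(prototypes, key=lambda p: distance(unknown, p[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional prototypes of two classes.
prototypes = [
    ([0.0, 0.0], "a"), ([0.1, 0.2], "a"), ([0.2, 0.1], "a"),
    ([1.0, 1.0], "o"), ([0.9, 1.1], "o"), ([1.1, 0.9], "o"),
]
label = knn_classify([0.15, 0.1], prototypes, k=3)
```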

Imagine now that a fixed number of dis¬ 
crete neurons is in parallel, looking at the 
speech spectrum, or the set of input sig¬ 
nals. Imagine that each neuron has a tem¬ 
plate, a reference spectrum with respect to 
which the degree of matching with the 
input spectrum can be defined. Imagine 
further that the different neurons compete,
the neuron with the highest matching
score being regarded as the “winner.”
The input spectrum would then be 
assigned to the winner in the same way that 
an arbitrary vector is assigned to the closest 
reference vector and classified according 
to it in the above Voronoi tessellation. 

There are neural networks in which such
templates are formed adaptively, and 
which perform this comparison in paral¬ 
lel, so that the neuron whose template 
matches best with the input automatically 
gives an active response to it. Indeed, the 
self-organizing process described below 
defines reference vectors for the neurons 
such that their Voronoi tessellation sets 
near-optimal decision borders between the 
classes—i.e., the fraction of input vectors 
falling on the wrong side of the borders is 
minimized. In classical decision theory, 
theoretical minimization of the probabil¬ 
ity for misclassification is a standard 
procedure, and the mathematical setting 
for it is the Bayes theory of probability. In 
what follows, we shall thus point out that 
the vector quantization and nearest neigh¬ 
bor classification resulting in the neural 
network defines the reference vectors in 
such a way that their Voronoi tessellation 
very closely approximates the theoretical 
Bayesian decision surfaces. 

The neural network 

Detailed biophysical analysis of the 
phenomena taking place at the cell mem¬ 
brane of biological neurons leads to sys¬ 
tems of nonlinear differential equations 
with dozens of state variables for each neu¬ 
ron; this would be untenable in a computa¬ 
tional application. Obviously it is 
necessary to simplify the mathematics, 
while retaining some essentials of the real 
dynamic behavior. The approximations 
made here, while reasonably simple, are 
still rather “neural” and have been 
influential in many intriguing appli¬ 
cations. 

Figure 2 depicts one model neuron and 
defines its signal and state variables. The 
input signals are connected to the neuron 
with different, variable “transmittances” 
corresponding to the coupling strengths of 
the neural junctions called synapses. The 
latter are denoted by $\mu_{ij}$ (here i is the index
of the neuron and j that of its input). Correspondingly,
$\xi_{ij}$ is the signal value (signal
activity, actually the frequency of the neural
impulses) at the jth input of the ith
neuron.

Each neuron is thought to act as a pulse-frequency
modulator, producing an output
activity $\eta_i$ (actually a train of neural
impulses with this repetition frequency), 
which is obtained by integrating the input 
signals according to the following 
differential equation. (The biological neurons
have an active membrane with a
capacitance that integrates input currents
and triggers a volley of impulses when a 
critical level of depolarization is achieved.) 

$$\frac{d\eta_i}{dt} = \sum_j \mu_{ij}\,\xi_{ij} - \gamma(\eta_i) \qquad (1)$$


The first term on the right corresponds 
to the coupling of input signals to the neu¬ 
ron through the different transmittances; 
a linear, superpositive effect was assumed 
for simplicity. The last term, $-\gamma(\eta_i)$,
stands for a nonlinear leakage effect that
describes all nonideal properties, such as
saturation, leakage, and shunting effects
of the neuron, in a simple way. It is
assumed to be a stronger than linear function
of $\eta_i$. It is further assumed that the
inverse function $\gamma^{-1}$ exists. Then if the $\xi_{ij}$
are held stationary, or they are changing
slowly, we can consider the case $d\eta_i/dt \approx 0$,
whereby the output will follow the integrated
input as in a nonlinear, saturating
amplifier according to


$$\eta_i = \sigma\Bigl[\sum_j \mu_{ij}\,\xi_{ij}\Bigr] \qquad (2)$$

Here $\sigma[\cdot]$ is the inverse function of $\gamma$, and
it usually has a typical sigmoidal form, 
with low and high saturation limits and a 
proportionality range between. 
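
The relation between Equations 1 and 2 can be checked numerically. The sketch below assumes the leakage function $\gamma(\eta) = \eta^3$, chosen only because it is stronger than linear and has a closed-form inverse; it does not produce the saturating sigmoid described in the text. The weights, inputs, step size, and step count are likewise arbitrary.

```python
def gamma(eta):
    """Assumed leakage function: stronger than linear, invertible for eta >= 0."""
    return eta ** 3


def sigma(total_input):
    """Inverse of gamma: the steady-state output for a given integrated input."""
    return total_input ** (1.0 / 3.0)


def settle(mu, xi, eta0=0.0, dt=0.01, steps=5000):
    """Euler integration of Equation 1 with stationary input signals."""
    drive = sum(m * x for m, x in zip(mu, xi))
    eta = eta0
    for _ in range(steps):
        eta += dt * (drive - gamma(eta))
    return eta


mu = [0.5, 0.2, 0.3]   # hypothetical transmittances
xi = [1.0, 2.0, 0.5]   # hypothetical stationary inputs
eta_final = settle(mu, xi)
eta_predicted = sigma(sum(m * x for m, x in zip(mu, xi)))
```

After settling, the integrated activity agrees with the value given by Equation 2 for the same inputs.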

The settling of activity according to 
Equation 1 proceeds very quickly; in bio¬ 
logical circuits it occurs in tens of milli¬ 
seconds. Next we consider an adaptive 
process in which the transmittances $\mu_{ij}$ are
assumed to change too. This is the effect 
regarded as “learning” in neural circuits, 
and its time constants are much longer. In 
biological circuits this process corresponds 
to changes in proteins and neural struc¬ 
tures that typically take weeks. A simple, 
natural adaptation law that already has 
suggested many applications is the follow¬ 
ing: First, we must stipulate that paramet¬ 
ric changes occur very selectively; thus 
dependence on the signals must be non¬ 
linear. The classical choice made by most 
modelers is to assume that changes are 
proportional to the product of input and 
output activities (the so-called law of 
Hebb). However, this choice, as such, 
would be unnatural because the 
parameters would change in one direction 
only (notice that the signals are positive). 
Therefore it is necessary to modify this 
law—for example, by including some kind 
of nonlinear “forgetting” term. Thus we 
can write 



$$\frac{d\mu_{ij}}{dt} = \alpha\,\eta_i\,\xi_{ij} - \beta(\eta_i)\,\mu_{ij} \qquad (3)$$

where $\alpha$ is a positive constant, the first
term is the “Hebbian” term, and the last
term represents the nonlinear “forgetting”
effect, which depends on the activity $\eta_i$;
forgetting is thus “active.” As will be
pointed out later, the first term defines
changes in the $\mu_{ij}$ in such a direction that
the neuron tends to become more and
more sensitive and selective to the particular
combination of signals presented
at the input. This is the basic adaptive
effect.

Figure 2. Symbol of a theoretical neuron
and the signal and system variables
relating to it. The small circles correspond
to the input connections, the
synapses.
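
Equation 3 can be tried out numerically under some simplifying assumptions that are not taken from the article: the neuron is treated as linear ($\eta_i = \sum_j \mu_{ij}\xi_{ij}$), the forgetting function is chosen as $\beta(\eta) = \alpha\eta^2$, the time step is one unit, and the input signals are artificial positive values.

```python
import math
import random


def adapt(m, x, alpha):
    """One Euler step of Equation 3 with eta = m.x and beta(eta) = alpha * eta^2."""
    eta = sum(mi * xi for mi, xi in zip(m, x))
    beta = alpha * eta * eta
    return [mi + alpha * eta * xi - beta * mi for mi, xi in zip(m, x)]


random.seed(0)
m = [0.1, 0.1]   # small positive initial transmittances
for _ in range(5000):
    # Artificial positive input signals scattered around (1.1, 0.6).
    x = [1.0 + 0.2 * random.random(), 0.5 + 0.2 * random.random()]
    m = adapt(m, x, alpha=0.01)

norm = math.sqrt(sum(mi * mi for mi in m))
```

With this choice of forgetting term the Hebbian growth does not run away: the weight vector turns toward the dominant direction of the inputs while its length settles near a constant.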

On the other hand, to stabilize the output
activity to a proper range, it seems very
profitable for $\beta(\eta_i)$ to be a scalar function
with a Taylor expansion in which the constant
term is zero. Careful analyses have
shown that this kind of neuron becomes
selective to the so-called largest principal
component of input. 9 For many choices
of the functional form, it can further be
shown that the $\mu_{ij}$ will automatically
become normalized such that the vector
formed from the $\mu_{ij}$ during the process
tends to a constant length (norm) independent
of the signal values that occur in the
process. 9 We shall employ this effect a bit
later in a further simplification of the
model.

Figure 3. (a) Neural network underlying the formation of the phonotopic maps
used in speech recognition. (b) The strengths of lateral interaction as a function of
distance (the “Mexican hat” function).

One cannot understand the essentials of
neural circuits unless one considers their
behavior as a collective system. An example
occurs in the “self-organizing feature
maps” in our speech recognition application.
Consider Figure 3a, where a set of
neurons forms a layer, and each neuron is
connected to its neighbors in the lateral
direction. We have drawn the network
one-dimensionally for clarity, although in
all practical applications it has been two-dimensional.
The external inputs, in the
simplest model used for pattern recognition,
are connected in parallel to all the
neurons of this network so that each neuron
can simultaneously “look” at the
same input. (Certain interesting but much
more complex effects result if the input
connections are made to different portions
of the network and the activation is
propagated through it in a sequence.)

The feedback connections are coupled
to the neurons in the same way as the external
inputs. However, for simplicity, only
the latter are assumed to have adaptive
synapses. If the feedbacks were adaptive,
too, this network would exhibit other more
complex effects. 9 It should also be
emphasized that the biological synaptic
circuits of the feedbacks are different from
those of the external inputs. The time-invariant
coupling coefficient of the feedback
connections, as a function of distance,
has roughly the “Mexican hat”
form depicted in Figure 3b, as in real neural
networks. For negative coupling,
signal-inverting elements are necessary; in
biological circuits inversion is made by a
special kind of inhibitory interneuron. If
the external input is denoted

$$I_i = \sum_j \mu_{ij}\,\xi_{ij} \qquad (4)$$

then the system equation for the network
activities $\eta_i$, denoting the feedback coupling
from neuron k to neuron i by $w_{ki}$, can
be written

$$\frac{d\eta_i}{dt} = I_i + \sum_{k \in S_i} w_{ki}\,\eta_k - \gamma(\eta_i) \qquad (5)$$

where k runs over the subset $S_i$ of those
neurons that have connections with neuron
i. A characteristic phenomenon, due
to the lateral feedback interconnections, 
will be observed first: The initial activity 
distribution in the network may be more 
or less random, but over time the activity 
develops into clusters or “bubbles” of a 
certain dimension, as shown in Figures 4 
and 5. If the interaction range is not much 
less than the diameter of the network, the 
network activity seems to develop into a 
single bubble, located around the maxi¬ 
mum of the (smoothed) initial activity. 
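
The ingredients of Equation 5 can be written down compactly. In the sketch below the lateral coupling is modeled as a difference of two Gaussians, which is one common way to obtain a “Mexican hat” shape; the widths, the gain, the interaction radius, the exclusion of self-coupling, and the cubic leakage function are all assumptions, not values from the article.

```python
import math


def mexican_hat(distance, excite_width=1.0, inhibit_width=3.0, inhibit_gain=0.5):
    """Assumed lateral coupling: narrow excitation minus broad inhibition."""
    excite = math.exp(-distance ** 2 / (2.0 * excite_width ** 2))
    inhibit = inhibit_gain * math.exp(-distance ** 2 / (2.0 * inhibit_width ** 2))
    return excite - inhibit


def activity_derivative(eta, external, gamma, radius=6):
    """Right-hand side of Equation 5 for a one-dimensional layer of neurons."""
    n = len(eta)
    rates = []
    for i in range(n):
        # S_i: neighbors of neuron i within the interaction radius (assumed).
        neighbors = [k for k in range(max(0, i - radius), min(n, i + radius + 1))
                     if k != i]
        feedback = sum(mexican_hat(abs(i - k)) * eta[k] for k in neighbors)
        rates.append(external[i] + feedback - gamma(eta[i]))
    return rates


external = [0.0, 0.2, 1.0, 0.2, 0.0]   # hypothetical external inputs I_i
rates = activity_derivative([0.0] * 5, external, gamma=lambda e: e ** 3)
```

With all activities at zero the feedback and leakage terms vanish, so the initial rate of change equals the external input; the lateral terms only come into play as activity builds up.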

Consider now that there is no external 
source of activation other than that 
provided by the input signal connections, 
which extend in parallel over the whole 
network. According to Equations 1 and 2,
the strength of the initial activation of a
neuron is proportional to the dot product
$m_i^T x$, where $m_i$ is the vector of the $\mu_{ij}$, x is
the vector of the $\xi_{ij}$, and T denotes the transpose
of a vector. (We use here concepts of
matrix algebra whereby $m_i$ and x are
column vectors.) Therefore, the bubble is
formed around those units at which $m_i^T x$
is maximum.

The saturation limits of $\sigma[\cdot]$ defined by
Equation 2 stabilize the activities $\eta_i$ to
either a low or a high value. Similarly,
$\beta(\eta_i)$ takes on either of two values. Without
loss of generality, it is possible to rescale
the variables $\xi_{ij}$ and $\mu_{ij}$ to make
$\eta_i \in \{0, 1\}$, $\beta(\eta_i) \in \{0, \alpha\}$, whereby Equation 3
will be further simplified and split
in two equations:

$$\frac{d\mu_{ij}}{dt} = \alpha\,(\xi_{ij} - \mu_{ij}) \qquad (6a)$$

if $\eta_i = 1$ and $\beta = \alpha$ (inside the
bubble)

$$\frac{d\mu_{ij}}{dt} = 0 \qquad (6b)$$

for $\eta_i = \beta = 0$ (outside the bubble)


It is evident from Equation 6 that the 
transmittances $\mu_{ij}$ then adaptively tend to
follow up the input signals $\xi_{ij}$. In other
words, these neurons start to become selec¬ 
tively sensitized to the prevailing input pat¬ 
tern. But this occurs only when the bubble 
lies over the particular neuron. For 
another input, the bubble lies over other 
neurons, which then become sensitized to 
that input. In this way different parts of 
the network are automatically “tuned” to
different inputs. 
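
The selective tuning described by Equations 6a and 6b can be illustrated with a few lines of code. The sketch assumes a unit Euler time step, a constant $\alpha$, and a fixed bubble; the three weight vectors and the input are arbitrary.

```python
def adapt_weights(weights, x, in_bubble, alpha=0.1, steps=200):
    """Euler integration (unit time step) of Equations 6a and 6b."""
    weights = [list(w) for w in weights]
    for _ in range(steps):
        for i, w in enumerate(weights):
            if in_bubble[i]:                      # Equation 6a
                for j in range(len(w)):
                    w[j] += alpha * (x[j] - w[j])
            # Equation 6b: neurons outside the bubble are left unchanged.
    return weights


weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 0.0]]   # hypothetical transmittances
x = [0.2, 0.8]                                    # stationary input pattern
tuned = adapt_weights(weights, x, in_bubble=[True, True, False])
```

The two neurons under the bubble end up with weight vectors practically equal to the input, while the third neuron keeps its original weights.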

The network will indeed be tuned to 
different inputs in an ordered fashion, as 
if a continuous map of the signal space 
were formed over the network. The con¬ 
tinuity of this mapping follows from the 
simple fact that the vectors $m_i$ of contiguous
units (within the bubbles) are modified
in the same direction, so that during
the course of the process the neighboring 
values become smoothed. The ordering of 
these values, however, is a very subtle 
phenomenon, the proof or complete 
explanation of which is mathematically 
very sophisticated 9 and cannot be given 
here. The effect is difficult to visualize 
without, say, an animation film. A con¬ 
crete example of this kind of ordering is the 
phonotopic map described later in this 
article. 


Shortcut learning 
algorithm 

In the time-continuous process just 
described, the weight vectors attain 
asymptotic values, which then define a 
vector quantization of the input signal 
space, and thus a classification of all its 
vectors. In practice, the same vector quan¬ 
tization can be computed much more 
quickly from a numerically simpler algorithm.
The bubble is equivalent to a neighborhood
set $N_c$ of all those network units
that lie within a certain radius from a cer¬ 
tain unit c. It can be shown that the size of 
the bubble depends on the interaction 
parameters, and so we can reason that the 
radius of the bubble is controllable, even¬ 
tually being definable as some function of 
time. For good self-organizing results, it 
has been found empirically that the radius 
indeed should decrease in time monotonically.
Similarly $\alpha = \alpha(t)$ ought to be a
monotonically decreasing function of 
time. Simple but effective choices for these 
functions have been determined in a series 
of practical experiments. 9 

As stated earlier, the process defined by
Equation 3 normalizes the weight vectors
$m_i$ to the same length. Since the bubble is
formed around those units at which $m_i^T x$ is
maximum, its center also coincides with
that unit for which the norm of the vectorial
difference $x - m_i$ is minimum.
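
The step from the maximum dot product to the minimum distance can be made explicit by expanding the squared norm, under the stated assumption that all weight vectors have the same length:

```latex
\|x - m_i\|^2 = \|x\|^2 - 2\, m_i^T x + \|m_i\|^2
```

For a given input, $\|x\|^2$ is the same for every unit, and $\|m_i\|^2$ is the same constant for every unit by the normalization. Only the middle term varies with i, so the unit that maximizes $m_i^T x$ is also the unit that minimizes $\|x - m_i\|$.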

Figure 4. Development of the distribution of activity over time (t) into a stable
“bubble” in a laterally interconnected neural network (cf. Figure 3). The activities
of the individual neurons ($\eta_i$) are shown in the logarithmic scale.

Figure 5. “Bubbles” formed in a two-dimensional network viewed from the top.
The dots correspond to neurons, and their sizes correspond to their activity. In the
picture on the right, the input was changing slowly, and the motion of the bubble is
indicated by its “tail.”

Combining all the above results, we
obtain the following shortcut algorithm.
Let us start with random initial values $m_i = m_i(0)$.
For t = 0, 1, 2, ..., compute:

(1) Center of the bubble (c):

$$\|x(t) - m_c(t)\| = \min_i \{\|x(t) - m_i(t)\|\} \qquad (7a)$$

(2) Updated weight vectors:

$$m_i(t+1) = m_i(t) + \alpha(t)\,(x(t) - m_i(t)) \quad \text{for } i \in N_c$$
$$m_i(t+1) = m_i(t) \quad \text{for all other indices } i \qquad (7b)$$
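
The two steps above can be sketched in a few lines of Python. For brevity the sketch uses a one-dimensional chain of ten units rather than a two-dimensional map, linearly decreasing schedules for $\alpha(t)$ and the neighborhood radius, and two artificial clusters of two-dimensional samples; all of these are illustrative assumptions, not the empirically determined choices of the original system.

```python
import math
import random


def distance(a, b):
    """Euclidean norm of the vectorial difference."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def find_winner(weights, x):
    """Equation 7a: index c of the weight vector closest to x."""
    return min(range(len(weights)), key=lambda i: distance(x, weights[i]))


def train_map(samples, n_units=10, steps=2000, seed=0):
    """Shortcut algorithm on a one-dimensional chain of units."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for t in range(steps):
        x = samples[rng.randrange(len(samples))]
        # Assumed schedules: both decrease monotonically with t.
        alpha = 0.5 * (1.0 - t / steps)
        radius = int(round((n_units / 2) * (1.0 - t / steps)))
        c = find_winner(weights, x)
        for i in range(max(0, c - radius), min(n_units, c + radius + 1)):
            # Equation 7b for i in N_c; all other units keep their weights.
            weights[i] = [w + alpha * (xj - w) for w, xj in zip(weights[i], x)]
    return weights


# Two artificial clusters stand in for the spectra of two phonemes.
rng = random.Random(1)
samples = ([[rng.gauss(0.2, 0.02), rng.gauss(0.2, 0.02)] for _ in range(50)]
           + [[rng.gauss(0.8, 0.02), rng.gauss(0.8, 0.02)] for _ in range(50)])
weights = train_map(samples)
```

After training, `find_winner(weights, x)` assigns any new vector to its closest weight vector, which is the classification defined by Equation 7a.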


As stated above, $\alpha = \alpha(t)$ and $N_c = N_c(t)$
are empirical functions of time. The
asymptotic values of the $m_i$ define the
vector quantization. Notice, too, that
Equation 7a defines the classification of
input according to the closest weight vector.

We must point out that if $N_c$ contained
the index i only, Equations 7a and 7b
would superficially resemble the classical
vector quantization method called k-means
clustering. 10 The present method,
however, is more general because the corrections
are made over a wider, dynamically
defined neighborhood set, or bubble
$N_c$, so that an ordered mapping is
obtained. Together with some fine adjustments
of the $m_i$ vectors, 9 spectral recognition
accuracy is improved significantly.

Figure 6. The signal of natural speech is preanalyzed and represented on 15 spectral
channels ranging from 200 Hz to 5 kHz. The spectral powers of the different channel
outputs are presented as input to an artificial neural network. The neurons are
tuned automatically, without any supervision or extra information, to the acoustic
units of speech identifiable as phonemes. In this set of pictures the neurons correspond
to the small rectangular subareas. Calibration of the map was made with
50 samples of each test phoneme. The shaded areas correspond to histograms of
responses from the map to certain phonemes (white: maximum).


Phonotopic maps 

For this discussion we assume that a lat¬ 
tice of hexagonally arranged neurons 
forms a two-dimensional neural network 
of the type depicted in Figure 3. As already 
described, the microphone signal is first 
converted into a spectral representation, 
grouped into 15 channels. These channels 
together constitute the 15-component
stochastic input vector x, a function of
time, to the network. The self-organizing
process has been used to create a “topographic,”
two-dimensional map of speech
elements onto the network. 

Superficially this network seems to have 
only one layer of neurons; due to the 
lateral interactions in the network, how¬ 
ever, its topology is in effect even more 
complicated than that of the famous mul¬ 
tilayered Boltzmann machines or back- 
propagation networks. 11 Any neuron in 
our network is also able to create an inter¬ 
nal representation of input information in 
the same way as the “hidden units” in the 
backpropagation networks eventually do. 
Several projects have recently been 
launched to apply Boltzmann machines to 
speech recognition. We should learn in the 
near future how they compete with the 
design described here. 

The input vectors x, representing short- 
time spectra of the speech waveform, are 
computed in our system every 9.83 milli¬ 
seconds. These samples are applied in 
Equations 7a and 7b as input data in their 
natural order, and the self-organizing process
then defines the $m_i$, or the weight vectors,
of the neurons. One striking result is
that the various neurons of the network 
become sensitized to spectra of different 
phonemes and their variations in a two- 
dimensional order, although teaching was 
not done by the phonemes; only spectral 
samples of input were applied. The reason 
is that the input spectra are clustered 
around phonemes, and the process finds 
these clusters. The maps can be calibrated 
using spectra of known phonemes. If then 
a new or unknown spectrum is presented 
at the inputs, the neuron with the closest 
transmittance vector $m_i$ gives the
response, and so the classification occurs
in accordance with the Voronoi tessellation
in which the $m_i$ act as reference vectors.
The values of these vectors very
closely reflect the actual speech signal 
statistics. 11 Figure 6 shows the calibration 
result for different phonemic samples as a 
gray-level histogram of such responses, 
and Figure 7 shows the map when its neu¬ 
rons are labeled according to the majority 
voting for a number of different 
responses. 
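
Calibration and recognition can be sketched as follows. The three weight vectors stand in for an already trained map, and the labeled samples are invented for illustration; a real calibration would use spectra of known phonemes.

```python
from collections import Counter, defaultdict


def find_winner(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, weights[i])))


def calibrate(weights, labeled_samples):
    """Label each neuron with the phoneme it answers to most often."""
    votes = defaultdict(Counter)
    for x, phoneme in labeled_samples:
        votes[find_winner(weights, x)][phoneme] += 1
    return {unit: counts.most_common(1)[0][0] for unit, counts in votes.items()}


def recognize(weights, labels, x):
    """Classify x by the label of its winning neuron (None if uncalibrated)."""
    return labels.get(find_winner(weights, x))


weights = [[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]]   # hypothetical trained map
labeled = [([0.12, 0.88], "a"), ([0.08, 0.93], "a"), ([0.2, 0.8], "e"),
           ([0.88, 0.15], "o"), ([0.92, 0.1], "o")]
labels = calibrate(weights, labeled)
```

A neuron that never wins during calibration stays unlabeled, which is why `recognize` may return `None` for some inputs in this sketch.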

The speech signal is a continuous wave¬ 
form that makes transitions between var¬ 
ious states, corresponding to the 
phonemes. On the other hand, as stated 
earlier, the plosives are detectable only as 
transient states of the speech waveform. 
For that reason their labeling in Figure 7 
is not reliable. Recently we solved the
problem of more accurate detection of
plosives and certain other phonemic categories
by using special, auxiliary maps in
which only a certain category of phonemes
was represented, and which were trained
by a subset of samples. For this purpose we
first detect the presence of such phonemes
(as a group) from the waveform, and then
we use this information to activate the corresponding
map. For instance, the occurrence
of /k,p,t/ is indicated by low signal
energy, and the corresponding spectral
samples are picked from the transient
regions following silence. The nasals as a
group are detectable by responses obtained
from the middle area of the main map.

Figure 7. The neurons, shown as circles, are labeled with the symbols of the phonemes
to which they “learned” to give best responses. Most neurons give a unique
answer; the double labels here show neurons that respond to two phonemes. Distinction
of /k,p,t/ from this map is not reliable and needs the analysis of the transient
spectra of these phonemes by an auxiliary map. In the Japanese version there
are auxiliary maps for /k,p,t/, /b,d,g/, and /m,n,ŋ/ for more accurate analysis.

Figure 8. Sequence of the responses obtained from the phonotopic map when the
Finnish word humppila was uttered. The arrows correspond to intervals of 9.83
milliseconds, at which the speech waveform was analyzed spectrally.

Another problem is segmentation of the 
responses from the map into a standard 
phonemic transcription. Consider that the 
spectral samples are taken at regular inter¬ 
vals every 9.83 milliseconds, and they are 
first labeled in accordance with the cor¬ 
responding phonemic spectra. These 
labeled samples are called quasiphonemes;
in contrast, the duration of a true phoneme 
is variable, say, from 40 to 400 millise¬ 
conds. We have used several alternative 
rules for the segmentation of quasipho¬ 
neme sequences into true phonemes. One 
of them is based on the degree of stability 
of the waveform; most phonemes, apart
from plosives, have a unique stationary
state. Another, more heuristic method is
to decide that if m out of n successive
quasiphonemes are the same, they correspond
to a single phoneme; e.g., m = 4
and n = 7 are typical values.
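
One possible reading of the m-out-of-n rule is a sliding window over the quasiphoneme sequence. The sketch below makes two assumptions of its own: the window advances one sample at a time, and consecutive detections of the same label are merged into one phoneme (so doubled phonemes are not distinguished). The test sequence is invented.

```python
from collections import Counter


def segment(quasiphonemes, m=4, n=7):
    """Collapse a quasiphoneme sequence with an m-out-of-n majority rule."""
    phonemes = []
    for start in range(len(quasiphonemes) - n + 1):
        window = quasiphonemes[start:start + n]
        label, count = Counter(window).most_common(1)[0]
        # Assumption: consecutive detections of one label are a single phoneme.
        if count >= m and (not phonemes or phonemes[-1] != label):
            phonemes.append(label)
    return phonemes


# Hypothetical labels, one per 9.83-ms sample, with two isolated errors.
sequence = list("hhhhhuhuuuuuuummmmmmm")
result = segment(sequence)
```

The isolated misclassifications near the transition from h to u are absorbed by the majority rule and do not produce extra phonemes.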

The sequences of quasiphonemes can 
also be visualized as trajectories over the 
main map, as shown in Figure 8. Each 
arrowhead represents one spectral sample. 
For clarity, the sequence of coordinates 
shown by arrows has been slightly 
smoothed to make the curves more con¬ 
tinuous. It is clearly discernible that con¬ 
vergence points of the speech waveform 
seem to correspond to certain (stationary) 
phonemes. 

This kind of graph provides a new 
means, in addition to some earlier ones, 
for the visualization of the phonemes of 
speech, which may be useful for speech 
training and therapy. Profoundly deaf 
people may find it advantageous to have 
an immediate visual feedback from their 
speech. 

It may be necessary to point out that the 
phonotopic map is not the same thing as 
the so-called formant maps used in pho¬ 
netics. The latter display the speech signal 
in coordinates that correspond to the two 
lowest formants, or resonant frequencies
of the vocal tract. Neither is this map any
kind of principal component graph for 
phonemes. The phonotopic map displays 
the images of the complete spectra as 
points on a plane, the distances of which
approximately correspond to the vectorial
differences between the original spectra; so
this map should rather be regarded as a
similarity graph, the coordinates of which
have no explicit interpretation. 
Figure 9. The coprocessor board for the neural network and the postprocessing
functions. 



Figure 10. Block diagram of the coprocessor board. A/D: analog-to-digital converter.
TMS320: Texas Instruments 32010 signal processor chip. RAM/ROM: 4K-word
random-access memory, 256-word programmable read-only memory.
EPROM: 64K-byte electrically erasable read-only memory. DRAM: 512K-byte 
dual-port random-access memory. SRAM: 96K-byte paged dual-port random- 
access memory. 80186: Intel microprocessor CPU. 8256: parallel interface. 


Actually, the phoneme recognition 
accuracy can still be improved by three or 
four percent if the templates $m_i$ are fine-tuned:
small corrections to the responding
neurons can be made automatically by 
turning their template vectors toward x if 
a tentative classification was correct, and 
away from x if the result was wrong. 
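
The fine-tuning step can be sketched as a single supervised correction of the winning template. The step size `alpha` and the example vectors are assumptions; the article does not state how large the corrections are.

```python
def fine_tune(weights, labels, x, true_label, alpha=0.05):
    """Move the winning template toward x if it classified x correctly,
    and away from x otherwise. Returns the index of the winner."""
    c = min(range(len(weights)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, weights[i])))
    sign = 1.0 if labels[c] == true_label else -1.0
    weights[c] = [w + sign * alpha * (xj - w) for w, xj in zip(weights[c], x)]
    return c


weights = [[0.0, 0.0], [1.0, 1.0]]   # hypothetical templates
labels = ["a", "o"]                  # labels assigned during calibration
fine_tune(weights, labels, [0.2, 0.0], "a")   # correct: unit 0 moves toward x
fine_tune(weights, labels, [0.8, 1.0], "a")   # wrong: unit 1 moves away from x
```

Repeating such corrections over labeled samples shifts the decision borders between neighboring templates without changing the rest of the map.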

Postprocessing in 
symbolic form 

Even if the classification of speech spectra were error-free, the phonemes would 
not be identifiable from them with 100-percent reliability. This is because 
there are coarticulation effects in speech: the phonemes are influenced by 
neighboring phonemes. One might imagine it possible to list and take into 
account all such variations. But there may be many hundreds of different 
frames or contexts of neighboring phonemes in which a particular phoneme may 
occur. Even this, however, is an optimistic figure since the neighbors too 
may be transformed by other coarticulation effects and errors. Thus, the 
correction of such transformed phonemes should be made by reference to some 
kind of context-sensitive stochastic grammar, the rules of which are derived 
from real examples. I have developed a program code that automatically 
constructs the grammatical transformation rules on the basis of speech 
samples and their correct reference transcriptions. 12 A typical error 
grammar may contain 15,000 to 20,000 rules (productions), and these rules can 
be encoded as a data structure or stored in an associative memory. The 
optimal amount of context is determined automatically for each rule 
separately. No special hardware is needed; the search of the matching rules 
and their application can be made in real time by efficient and fast software 
methods, based on so-called hash coding, without slowing down the recognition 
operation. 
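
How a hash-coded rule table might be searched can be illustrated with a small Python sketch. The article does not describe the rule format, so everything here is an assumption made for illustration: a rule is keyed by (left context, erroneous symbol, right context), a Python dict plays the role of the hash table, and the widest matching context is tried first. The rule and the strings are invented toy data.

```python
def build_rules(examples):
    """Build a hash table of correction rules.

    examples: iterable of (left, wrong, right, fixed) tuples, as they might be
    derived from aligned raw and reference transcriptions (hypothetical format).
    """
    rules = {}
    for left, wrong, right, fixed in examples:
        rules[(left, wrong, right)] = fixed
    return rules


def apply_rules(transcription, rules, max_context=2):
    """Correct each symbol using the widest context for which a rule exists."""
    out = []
    for i, ch in enumerate(transcription):
        replacement = ch
        for width in range(max_context, -1, -1):
            left = transcription[max(0, i - width):i]
            right = transcription[i + 1:i + 1 + width]
            if (left, ch, right) in rules:        # one hash lookup per context width
                replacement = rules[(left, ch, right)]
                break
        out.append(replacement)
    return "".join(out)


# Invented toy rule: "d" between "s" and "o" should have been "t".
rules = build_rules([("s", "d", "o", "t")])
corrected = apply_rules("sdop", rules)
```

Each symbol costs at most max_context + 1 dictionary lookups, which is why a table of many thousands of rules need not slow recognition down.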

The two-stage speech recognition sys¬ 
tem described in this article is a genuine 
phonetic typewriter, since it outputs ortho¬ 
graphic transcriptions for unrestricted 
utterances, the forms of which only 
approximately obey certain morphologi¬ 
cal rules or regularities of a particular lan¬ 
guage. We have implemented this system 
for both Finnish and (romanized) Japa¬ 
nese. Both of these languages, like Latin, 
are characterized by the fact that their 
orthography is almost identical to their 
phonemic transcription. 



As a complete speech recognition 
device, our system can be made to operate 
in either of two modes: (1) transcribing 
dictation of unlimited text, whereby the 
words (at least in some common idioms) 
can be connected, since similar rules are 
applicable for the editing of spaces 
between the words (at present short pauses 
are needed to insert spaces); and (2) 
isolated-word recognition from a large 
vocabulary. 

In isolated-word recognition we first use the phonotopic map and its 
segmentation algorithm to produce a raw phonemic transcription of the uttered 
word. Then this transcription is compared with reference transcriptions 
earlier collected from a great many words. Comparison of partly erroneous 
symbolic expressions (strings) can be related to many standard similarity 
criteria. Rapid prescreening and spotting of the closest candidates can again 
be performed by associative or hash-coding methods; we have introduced a very 
effective error-tolerant searching scheme called redundant hash addressing, 
by which a small number of the best candidates, selected from vocabularies of 
thousands of items, can be located in a few hundred milliseconds (on a 
personal computer). After that, the more accurate final comparison between 
the much smaller number of candidates can be made by the best statistical 
methods. 
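
The two-step search (fast prescreening, then a more careful comparison of a few candidates) can be sketched as follows. The article does not give the details of redundant hash addressing, so this sketch rests on assumptions: the redundant features are overlapping letter trigrams, each feature is hashed through a Python dict to the words containing it, candidates are ranked by votes, and the final comparison uses plain edit distance in place of "the best statistical methods." The vocabulary is a toy example.

```python
from collections import defaultdict


def trigrams(word):
    """Overlapping three-letter features, with one space of padding at each end."""
    padded = " " + word + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def build_index(vocabulary):
    """Hash every feature to the set of vocabulary words that contain it."""
    index = defaultdict(set)
    for word in vocabulary:
        for g in trigrams(word):
            index[g].add(word)
    return index


def edit_distance(a, b):
    """Levenshtein distance by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def recognize(raw, index, shortlist=3):
    """Prescreen by feature votes, then pick the closest shortlisted word."""
    votes = defaultdict(int)
    for g in trigrams(raw):
        for word in index.get(g, ()):
            votes[word] += 1
    candidates = sorted(votes, key=lambda w: (-votes[w], w))[:shortlist]
    if not candidates:
        return None
    return min(candidates, key=lambda w: (edit_distance(raw, w), w))


index = build_index(["talo", "tulo", "kala", "sana"])
best = recognize("tallo", index)   # raw transcription with one inserted symbol
```

Because an erroneous transcription still shares most of its features with the correct word, a few wrong symbols lower the vote count without removing the word from the shortlist.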

Hardware implementations 
and performance 

The system’s neural network could, in 
principle, be built up of parallel hardware 
components that behave according to 
Equations 5 and 6. For the time being, no 
such components have been developed. 
On the other hand, for many applications 
the equivalent functions from Equations 
7a and 7b are readily computable by fast 
digital signal processor chips; in that case 
the various neurons only exist virtually, as 
the signal processors are able to solve their 
equations by a timesharing principle. Even 
this operation, however, can be performed 
in real time, especially in speech 
processing. 

The most central neural hardware of our 
system is contained on the coprocessor 
board shown in Figure 9. Its block dia¬ 
gram is shown in Figure 10. Only two sig¬ 
nal processors have been necessary: one 
for the acoustic preprocessor that 
produces the input pattern vectors x, and 
another for timeshared computation of 
the responses from the neural network. 
For the time being, the self-organized computation 
of the templates m_i, or “learning,” 
is made in an IBM PC AT-compatible 
host processor, and the transmittance 
parameters (synaptic transmittances) 
are loaded onto the coprocessor 
board. Newer designs are intended to 
operate as stand-alone systems. A stan¬ 
dard microprocessor CPU chip on our 
board takes care of overall control and 
data routing and performs some 
preprocessing operations after FFT (such 
as logarithmization and normalization), as 
well as segmenting the quasiphoneme 
strings and deciding whether the auxiliary 
transient maps are to be used. Although 
the 80186 is a not-so-effective CPU, it still 
has extra capacity for postprocessing oper¬ 
ations: it can be programmed to apply the 
context-sensitive grammar for unlimited 
text or to perform the isolated-word recog¬ 
nition operations. 

The personal computer has been used 
during experimentation for all post¬ 
processing operations. Nonetheless, the 
overall recognition operations take place 
in near real time. In the intended mode of 
operation the speech recognizer will only 
assist the keyboard operations and com¬ 
municate with the CPU through the same 
channel. 

One of the most serious problems with 
this system, as well as with any existing 
speech recognizer, is recognition accuracy, 
especially for an arbitrary speaker. After 
postprocessing, the present transcription 
accuracy varies between 92 and 97 percent, 
depending on speaker and difficulty of 
text. We performed most of the experi¬ 
ments reported here with half a dozen male 
speakers, using office text, names, and the 
most frequent words of the language. The 
number of tests performed over the years 
is inestimable. Typically, thousands of 
words have been involved in a particular 
series of tests. Enrollment of a new speaker 
requires dictation of 100 words, and the 
learning processes can proceed concur¬ 
rently with dictation. The total learning 
time on the PC is less than 10 minutes. 
During learning, the template vectors of 
the phonotopic map are tuned to the new 
samples. 

Isolated-word recognition from a 
1000-word vocabulary is possible with an 
accuracy of 96 to 98 percent. Since the 
recognition system forms an intermediate 
symbolic transcription that can be compared 
with any standard reference transcriptions, 
the vocabulary or its active 
subsets can be defined in written form and 
changed dynamically during use, without 
the need of speaking any samples of these 
words. 

All output, for unlimited text as well as 
for isolated words, is produced in near real 
time: the mean delay is on the order of 250 
milliseconds per word. It should be noticed 
that contemporary microprocessors 
already have much higher speeds (typically 
five times higher) than the chips used in 
our design. 

To the best of our knowledge, this sys¬ 
tem is the only existing complete speech 
recognizer that employs neural computing 
principles and has been brought to a com¬ 
mercial stage, verified by extensive tests. 
Of course, it still falls somewhat short of 
expectations; obviously some kind of lin¬ 
guistic postprocessing model would 
improve its performance. On the other 
hand, our principal aim was to demon¬ 
strate the highly adaptive properties of 
neural networks, which allow a very 
accurate, nonlinear statistical analysis of 
real signals. These properties ought to be 
a goal of all practical “neurocomputers.” 

Neural Nets for Adaptive 
Filtering and Adaptive 
Pattern Recognition 

Bernard Widrow, Stanford University 
Rodney Winter, United States Air Force 


The fields of adaptive signal 
processing and adaptive neural 
networks have been developing 
independently but have the adaptive linear 
combiner (ALC) in common. With its 
inputs connected to a tapped delay line, the 
ALC becomes a key component of an 
adaptive filter. With its output connected 
to a quantizer, the ALC becomes an adap¬ 
tive threshold element or adaptive neuron. 

Adaptive filters have enjoyed great 
commercial success in the signal process¬ 
ing field. All high-speed modems now use 
adaptive equalization filters. Long¬ 
distance telephone and satellite communi¬ 
cations links are being equipped with 
adaptive echo cancelers to filter out echo, 
allowing simultaneous two-way commu¬ 
nications. Other applications include noise 
canceling and signal prediction. 

Adaptive threshold elements, on the 
other hand, are the building blocks of neu¬ 
ral networks. Today neural nets are the 
focus of widespread research interest. 
Areas of investigation include pattern 
recognition and trainable logic. Neural 
network systems have not yet had the com¬ 
mercial impact of adaptive filtering. 

The commonality of the ALC to adap¬ 
tive signal processing and adaptive neural 
networks suggests the two fields have 
much to share with each other. This arti¬ 
cle describes practical applications of the 
ALC in signal processing and pattern 
recognition. 




A new multilayer 
adaptation algorithm 
that descrambles 
output and reproduces 
original patterns is 
advancing the 
practicality of neural- 
network pattern- 
recognition systems. 


The adaptive linear 
combiner 

The ALC shown in Figure 1 is the basic building block for most adaptive 
systems. The output is a linear combination of the many input signals. The 
weighting coefficients comprise a weight vector. The input signals comprise 
an input signal vector. The output signal is the inner product or dot product 
of the input signal vector with the weight vector. The output signal is 
compared with a special input signal called the desired response, and the 
difference is the error signal. To optimize performance, the ALC’s weighting 
coefficients or weights are generally adjusted to minimize the mean square of 
the error signal. Of the many adaptive algorithms to adjust the weights 
automatically, the most popular is the Widrow-Hoff LMS (least mean square) 
algorithm devised in 1959. 1 
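
A minimal sketch of the ALC with an LMS update may make the description concrete. The article names the algorithm but does not state the update at this point, so the rule below (each weight changes by 2 * mu * error * input, with mu a small step size) is the commonly published Widrow-Hoff form, given here as an assumption; the training pairs and the value of mu are invented for illustration.

```python
def lms_step(weights, x, desired, mu):
    """One LMS update of an adaptive linear combiner.

    weights: weight vector, modified in place.
    x:       input signal vector.
    desired: desired response for this input.
    mu:      step size (assumed small enough for stable adaptation).
    Returns the error measured before the update.
    """
    y = sum(w * xi for w, xi in zip(weights, x))   # inner product of input and weights
    error = desired - y                            # desired response minus output
    for i, xi in enumerate(x):
        weights[i] += 2.0 * mu * error * xi        # step along the negative error gradient
    return error


# Toy training set that is exactly consistent with the weights [2, -1].
weights = [0.0, 0.0]
training = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
for _ in range(200):
    for x, d in training:
        lms_step(weights, x, d, mu=0.1)
```

Because the toy data admit a zero-error solution, the weights settle on it; with noisy data the same loop would hover around the least-mean-square solution instead.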

Adaptive filters. Digital signals generally originate from sampling 
continuous input signals by analog-to-digital conversion. Digital signals are 
often filtered by means of a tapped delay line or transversal filter, as 
shown in Figure 2a. The sampled input signal is applied to a string of delay 
boxes, each delaying the signal by one sampling period. An ALC is seen 
connected to the taps between the delay boxes. The filtered output is a 
linear combination of the current and past input signal samples. By varying 
the weights, the impulse response from input to output is directly 
controllable. Since the frequency response is the Fourier transform of the 
impulse response, controlling the impulse response controls the frequency 
response. The weights are usually adjusted so that the output signal will 
provide the best least-squares match over time to the desired-response input 
signal. 
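
The statement that the weights directly control the impulse response can be checked with a short sketch of a tapped-delay-line filter. The function name and the weight values are illustrative assumptions.

```python
def transversal_filter(weights, samples):
    """Filter a sample sequence with a tapped delay line and fixed weights."""
    taps = [0.0] * len(weights)          # delay-line contents, newest sample first
    output = []
    for s in samples:
        taps = [s] + taps[:-1]           # shift every sample one delay box along
        output.append(sum(w * t for w, t in zip(weights, taps)))
    return output


weights = [0.5, 0.3, 0.2]
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
response = transversal_filter(weights, impulse)
```

Feeding a unit impulse reads the weights back out one per sampling period, so the impulse response of this structure is the weight vector itself.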

Figure 1. Adaptive linear combiner (ALC). 

Figure 2. Adaptive digital filter: (a) details of a tapped-delay-line digital filter; (b) 
symbolic representation of an adaptive filter. 

The literature reports many other forms 
of adaptive filters. 1 Some use feedback to 
obtain both poles and zeros. The filter of 
Figure 2a realizes only zeros. Adaptive 
filters based on lattice structures achieve 
more rapid convergence under certain con¬ 
ditions. However, the simplest, most 
robust, and most widely used filter is that 
of Figure 2a adapted by the LMS algo¬ 
rithm. Figure 2b shows a symbolic repre¬ 
sentation of an adaptive filter. 

Adaptive threshold element. Figure 3 
shows an adaptive threshold element, a 
key component in adaptive pattern recog¬ 
nition systems. It consists of an adaptive 
linear combiner cascaded with a quantizer. 
The output of the ALC is quantized to produce 
a binary “decision.” Most often, the 
inputs are binary and the desired response 
is binary. As such, the adaptive threshold 
element is trainable and capable of imple¬ 
menting binary logic functions. The LMS 
algorithm was originally developed to 
train the adaptive threshold element of 
Figure 3. This element was called an adap¬ 
tive linear neuron or Adaline. 2 

The adaptive threshold element was an 
early neuronal model. The adaptive 
weights were analogous to synapses. The 
input vector components related to the 
dendritic inputs. The quantized output 
was analogous to the axonal output. The 
output decision was determined by a 
weighted sum of the inputs, in much the 
same way real neurons were believed to 
behave. 

Adaptive signal 
processing 

The adaptive filter of Figure 2b has an 
input signal and produces an output sig¬ 
nal. The desired response is supplied dur¬ 
ing training. A question naturally arises: 
If the desired response were known and 
available, why would one need the adap¬ 
tive filter? Put another way, how would 
one obtain the desired response in a prac¬ 
tical application? There is no general 
answer to these questions, but studying 
successful examples provides some insight. 

Example 1—system modeling. In many engineering and scientific applications, a 
system of unknown structure has observable input and output signals. One way 
of obtaining knowledge about the unknown system’s dynamic response is to 
apply its input to an adaptive filter and to use its output as the adaptive 
filter’s desired response. (See Figure 4.) The adaptive filter develops an 
impulse response to match that of the unknown system since the filter and the 
system develop similar outputs when driven by the same input. 
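
The system-modeling arrangement can be tried out numerically. In this sketch the "unknown" system is a short tapped-delay-line filter whose weights are hidden from the adaptive filter; those hidden weights, the random test input, the step size, and the LMS update form (2 * mu * error * input) are all assumptions chosen for illustration.

```python
import random


def fir(h, samples):
    """Output of a tapped-delay-line filter with fixed weights h."""
    taps = [0.0] * len(h)
    out = []
    for s in samples:
        taps = [s] + taps[:-1]
        out.append(sum(w * t for w, t in zip(h, taps)))
    return out


def model_system(inputs, outputs, n_taps, mu):
    """Adapt an n_taps filter so that its output matches the observed outputs."""
    weights = [0.0] * n_taps
    taps = [0.0] * n_taps
    for x, desired in zip(inputs, outputs):
        taps = [x] + taps[:-1]
        error = desired - sum(w * t for w, t in zip(weights, taps))
        weights = [w + 2.0 * mu * error * t for w, t in zip(weights, taps)]
    return weights


rng = random.Random(0)
inputs = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
hidden = [0.8, -0.4, 0.2]                 # stands in for the unknown system
outputs = fir(hidden, inputs)             # only inputs and outputs are observed
estimate = model_system(inputs, outputs, n_taps=3, mu=0.05)
```

After adaptation, the estimated weights approximate the hidden impulse response even though the adaptive filter saw only the input and output signals.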

Example 2—statistical prediction. One can estimate future values of 
time-correlated digital signals from present and past input samples. Wiener 3 
has developed optimal linear least-squares filtering techniques for signal 
prediction. When the signal’s autocorrelation function is known, Wiener’s 
theory yields the impulse response of the optimal filter. More often than 
not, however, the autocorrelation function is unknown and may be 
time-variable. One could use a correlator to measure the autocorrelation 
function and plug this into Wiener theory to get the optimal impulse 
response, or one could get the optimal prediction filter directly by adaptive 
filtering. Figure 5 illustrates the latter approach. 

In this figure, the input signal delayed by Δ time units is fed to an 
adaptive filter. The undelayed input serves as the desired response for this 
adaptive filter. The filter weights adapt and converge to produce a best 
least-squares estimate of the present input signal, given an input that is 
this very signal delayed by Δ. The optimal weights are copied into a “slave 
filter” whose input is undelayed and whose output therefore is a best 
least-squares prediction of the input Δ time units into the future. 
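
The delayed-input arrangement and the slave filter can be sketched for a signal that is easy to predict. The sinusoidal test signal, the two-tap filter, the one-sample delay, the step size, and the LMS update form are assumptions made for this illustration.

```python
import math


def train_predictor(signal, delay, n_taps, mu):
    """Adapt a filter whose input is the signal delayed by `delay` samples."""
    weights = [0.0] * n_taps
    for k in range(delay + n_taps - 1, len(signal)):
        x = [signal[k - delay - i] for i in range(n_taps)]   # delayed input vector
        error = signal[k] - sum(w * xi for w, xi in zip(weights, x))
        weights = [w + 2.0 * mu * error * xi for w, xi in zip(weights, x)]
    return weights


def slave_predict(weights, signal, k):
    """Apply the copied weights to the undelayed input: estimate of signal[k + delay]."""
    return sum(w * signal[k - i] for i, w in enumerate(weights))


omega = 1.0
signal = [math.sin(omega * k) for k in range(3000)]
weights = train_predictor(signal, delay=1, n_taps=2, mu=0.05)
k = len(signal) - 1
prediction = slave_predict(weights, signal, k)   # estimate of the next, unseen sample
```

For a pure sinusoid the exact one-step predictor has weights 2 cos(omega) and -1, so the adapted weights can be compared against that known answer.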


Figure 4. System modeling. 

Example 3—noise canceling. Separating a signal from additive noise is a 
common problem in signal processing. Figure 6a shows a classical approach to 
this problem using optimal Wiener or Kalman filtering. 3 The purpose of the 
optimal filter is to pass the signal s without distortion while stopping the 
noise n_0. In general, this cannot be done perfectly. Even with the best 
filter, the signal is distorted, and some noise goes through to the output. 

Figure 6b shows another approach to the problem using adaptive filtering. 
This approach is viable only when an additional “reference input” is 
available containing noise n_1, which is correlated with the original 
corrupting noise n_0. In Figure 6b, the adaptive filter receives the 
reference noise, filters it, and subtracts the result from the noisy “primary 
input,” s + n_0. For this adaptive filter, the noisy input s + n_0 acts as 
the desired response. The “system output” acts as the error for the adaptive 
filter. Adaptive noise canceling generally performs much better than the 
classical approach since the noise is subtracted out rather than filtered 
out. 

One might think that some prior knowledge of the signal s or of the noises 
n_0 and n_1 would be necessary before the filter could adapt to produce the 
noise-canceling signal y. A simple argument will show, however, that little 
or no prior knowledge of s, n_0, or n_1 or of their interrelationships is 
required. 

Figure 6. Separation of signal and noise: (a) classical approach; (b) adaptive 
noise-canceling approach. 

Figure 7. Canceling maternal heartbeat in fetal electrocardiography: (a) cardiac 
electric field vectors of mother and fetus; (b) placement of leads. 

Assume that s, n_0, n_1, and y are statistically stationary and have zero 
means. Assume that s is uncorrelated with n_0 and n_1, and suppose that n_1 
is correlated with n_0. The output is 

e = s + n_0 - y    (1) 

Squaring, one obtains 

e^2 = s^2 + (n_0 - y)^2 + 2s(n_0 - y)    (2) 

Taking expectations of both sides of Equation 2, and realizing that s is 
uncorrelated with n_0 and with y, yields 

E[e^2] = E[s^2] + E[(n_0 - y)^2] + 2E[s(n_0 - y)] 
       = E[s^2] + E[(n_0 - y)^2]    (3) 

Adapting the filter to minimize E[e^2] will not affect the signal power 
E[s^2]. Accordingly, the minimum output power is 

E_{\min}[e^2] = E[s^2] + E_{\min}[(n_0 - y)^2]    (4) 

When the filter is adjusted so that E[e^2] is minimized, E[(n_0 - y)^2] is 
therefore also minimized. The filter output y is then a best least-squares 
estimate of the primary noise n_0. Moreover, when E[(n_0 - y)^2] is 
minimized, E[(e - s)^2] is also minimized, since, from Equation 1, 

(e - s) = (n_0 - y)    (5) 

Adjusting or adapting the filter to minimize the total output power is 
tantamount to causing the output e to be a best least-squares estimate of the 
signal s for the given structure and adjustability of the adaptive filter and 
for the given reference input. 
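
The argument can be checked numerically with a small simulation. Every specific here is an assumption made for illustration: the signal s is a slow sinusoid, the reference noise n1 is white Gaussian, the primary noise n0 is n1 passed through a short two-tap path, and the canceler is a two-tap filter adapted with an LMS update of the form 2 * mu * error * input.

```python
import math
import random


def noise_canceler(primary, reference, n_taps, mu):
    """Return the system output e = primary - y, adapting the filter as it runs."""
    weights = [0.0] * n_taps
    taps = [0.0] * n_taps
    output = []
    for d, r in zip(primary, reference):
        taps = [r] + taps[:-1]
        y = sum(w * t for w, t in zip(weights, taps))   # estimate of the primary noise
        e = d - y                                       # system output, also the error
        weights = [w + 2.0 * mu * e * t for w, t in zip(weights, taps)]
        output.append(e)
    return output


rng = random.Random(1)
n = 4000
s = [math.sin(0.05 * k) for k in range(n)]                      # signal
n1 = [rng.gauss(0.0, 1.0) for _ in range(n)]                    # reference noise
n0 = [0.6 * n1[k] + 0.3 * (n1[k - 1] if k > 0 else 0.0) for k in range(n)]
primary = [s[k] + n0[k] for k in range(n)]
e = noise_canceler(primary, n1, n_taps=2, mu=0.005)

# Compare noise power before and after canceling over the last 1000 samples.
tail = range(n - 1000, n)
noise_power = sum(n0[k] ** 2 for k in tail) / len(tail)
residual_power = sum((e[k] - s[k]) ** 2 for k in tail) / len(tail)
```

The filter is never told what s looks like; it only minimizes the output power, and the residual noise e - s nevertheless ends up far smaller than the original noise n0, as Equations 3 to 5 predict.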

There are many practical applications 
for adaptive noise canceling techniques. 
One involves canceling interference from 
the mother’s heart when attempting to rec¬ 
ord clear fetal electrocardiograms. Figure 
7 shows the location of the fetal and mater¬ 
nal hearts and the placement of the input 
leads. The abdominal leads provide the 
primary input (containing fetal ECG and 
interfering maternal ECG signals), and the 
chest leads provide the reference input 
(containing pure interference, the mater¬ 
nal ECG). Figure 8 shows the results. The 
maternal ECG from the chest leads was 
adaptively filtered and subtracted from the 
abdominal signal, leaving the fetal ECG. 
This was an interesting problem since the 
fetal and maternal ECG signals had spec¬ 
tral overlap. The two hearts were electri¬ 
cally isolated and worked independently, 
but the second harmonic frequency of the 
maternal ECG was close to the fundamen¬ 
tal of the fetal ECG. Ordinary filtering 
techniques would have great difficulty 
with this problem. 

Example 4—adaptive echo canceling. 
Echo is a natural phenomenon in long¬ 
distance telephone circuits because of 
amplification in both directions and series 
coupling of telephone transmitters and 
receivers at each end of the circuit. Echo 
suppressors, used to break the feedback, 
give one-way communication to the party 
speaking first. To avoid switching effects 
and to permit simultaneous two-way 
transmission of voice and data, adaptive 
echo cancelers are replacing echo suppres¬ 
sors worldwide (see Figure 9). 

In Figure 9, the delay boxes represent transmission delays in the long line. 
Note that separate circuits are normally used in each direction because the 
repeater amplifiers used to overcome transmission loss are one-way devices. 
Hybrid transformers prevent incoming signals from coupling through the 
telephone set and passing as outgoing signals. Hybrids are balanced to do 
this by designing them for the average, local telephone circuit. Since each 
local circuit has its own length and electrical characteristics, the hybrid 
cannot do its job perfectly. Using an adaptive filter to cancel any incoming 
signal that might leak through the hybrid eliminates the possibility of echo. 
The circuit of Figure 9 works well, allowing simultaneous two-way 
communication without echo. 

Figure 8. Result of fetal ECG experiment (bandwidth, 3-35 Hz; sampling rate, 256 
Hz): (a) reference input (chest lead); (b) primary input (abdominal lead); (c) noise 
canceler output. 

Figure 9. Long-distance system with adaptive echo cancellation. 

Example 5—inverse modeling. Figure 4 showed use of an adaptive filter for 
direct modeling of an unknown system to obtain a close approximation to its 
impulse and frequency responses. By changing the configuration, it is 
possible to use the adaptive filter for inverse modeling to obtain the 
reciprocal of the unknown system’s transfer function. The idea is illustrated 
in Figure 10. The output of the unknown system is the input to the adaptive 
filter. The unknown system’s input delayed by Δ time units is the desired 
response of the adaptive filter. For simplicity, assume that Δ is set to zero 
delay. To make the error small, the cascade of the unknown system and the 
adaptive filter needs a unity transfer function. Therefore, using an adaptive 
algorithm to make the error small causes the adaptive filter to develop a 
transfer function that is the inverse of that of the unknown system. 

If the response of the unknown system contains a delay or is nonminimum 
phase, allowing a nonzero delay Δ will be highly advantageous. Including Δ 
delays the inverse impulse response but yields a much lower mean-square 
error. In some applications, one would like to make this delay as small as 
possible. In other applications, this delay is of no concern except that it 
should be chosen to minimize the mean-square error. Applications for inverse 
modeling exist in the field of adaptive control, in geophysical signal 
processing, where it is called “deconvolution,” and in telecommunications for 
channel equalization. 

Figure 10. Inverse modeling. 

Figure 11. Adaptive channel equalizer with decision-directed learning. 

Example 6—channel equalization. Telephone channels, radio channels, and even 
fiber optic channels can have non-flat frequency responses and nonlinear 
phase responses in the signal passband. Sending digital data at high speed 
through these channels often results in a phenomenon called “intersymbol 
interference,” caused by signal pulse smearing in the dispersive medium. 
Equalization in data modems combats this phenomenon by filtering incoming 
signals. A modem’s adaptive filter, by adapting itself to become a channel 
inverse, can compensate for the irregularities in channel magnitude and phase 
response. 

The adaptive equalizer in Figure 11 con¬ 
sists of a tapped delay line with an adap¬ 
tive linear combiner connected to the taps. 
Deconvolved signal pulses appear at the 
weighted sum, which is quantized to pro¬ 
vide a binary output corresponding to the 
original binary data transmitted through 
the channel. The ALC and its quantizer 
comprise a single Adaline. Any least- 
squares algorithm can adapt the weights, 
but the telecommunications industry uses 
the LMS algorithm almost exclusively. 

In operation, the weight at a central tap is generally fixed at unit value. 
Initially, all other weights are set to zero so that the equalizer has a flat 
frequency response and a linear phase response. Without equalization, 
telephone channels can provide quantized binary outputs that reproduce the 
transmitted data stream with error rates of 10^{-1} or less. As such, the 
quantized binary output can be used as the desired response to train the 
neuron. It is a noisy desired response initially. Sporadic errors cause 
adaptation in the wrong direction, but on average, adaptation proceeds 
correctly. As the neuron learns, noise in the desired response diminishes. 
Once the adaptive equalizer converges, the error rate will typically be 
10^{-6} or less. The method, called “decision-directed” learning, was 
invented by Robert W. Lucky of AT&T Bell Labs. 4 

Figure 12a shows the analog response of 
a telephone channel carrying high-speed 
binary pulse data. Figure 12b shows an 
“eye” pattern, which is the same signal 
after going through a converged adaptive 
equalizer. Equalization opens the eye and 
allows clear separation of +1 and -1 
pulses. Using a modem with an adaptive 
equalizer enables transmitting about four 
times as much data through the same chan¬ 
nel with the same reliability as without 
equalization. 

Integrated services digital network 
(ISDN), a new concept now in develop¬ 
ment and deployment, makes high-speed 
digital communication possible through 
ordinary local copper telephone circuits. 
ISDN requires both adaptive equalization 
and adaptive echo canceling at each line 
termination. The number of adaptive 
filters to be used in the world’s telecommu¬ 
nications plant will be massive. 


Adaptive pattern 
recognition 

The adaptive threshold element of Figure 3 can be used for pattern 
recognition and as a trainable logic device. It can be trained to classify 
input patterns into two categories. For these applications, the zeroth 
weight, w_0, has a constant input x_0 = +1 which does not change from input 
pattern to pattern. Varying the zeroth weight varies the threshold level of 
the quantizer. 

Linear separability. With n binary inputs and one binary output, a single 
neuron of the type shown in Figure 3 is capable of implementing certain logic 
functions. There are 2^n possible input patterns. A general logic 
implementation would be capable of classifying each pattern as either +1 or 
-1, in accord with the desired response. Thus, there are 2^{2^n} possible 
logic functions connecting n inputs to a single output. A single neuron is 
capable of realizing only a small subset of these functions, known as the 
linearly separable logic functions. 5 These are the set of logic functions 
that can be obtained with all possible settings of the weight values. 

Figure 12. Eye patterns produced by overlaying cycles of the received waveform: (a) before equalization; (b) after equalization. 

Figure 13 shows a two-input neuron, and Figure 14 shows all of its possible 
binary inputs in pattern vector space. In this space, the coordinate axes are 
the components of the input pattern vector. The neuron separates the input 
patterns into two categories, depending on the values of the input-signal 
weights and the bias weight. A critical thresholding condition occurs when 
the analog response y equals zero: 

y = x_1 w_1 + x_2 w_2 + w_0 = 0    (6) 

∴ x_2 = -(w_0/w_2) - (w_1/w_2) x_1    (7) 

Figure 14. Separating line in pattern space. 

Figure 14 graphs this linear relation, which comprises a separating line 
having slope and intercept of 

slope = -(w_1/w_2) 
intercept = -(w_0/w_2)    (8) 

The three weights determine slope, intercept, and the side of the separating 
line that corresponds to a positive output. The opposite side of the 
separating line corresponds to a negative output. 
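
As a small worked example of Equations 6 to 8, the sketch below computes the separating line for one illustrative choice of weights and classifies the four binary patterns. The weight values, the function names, and the convention that y = 0 is assigned to +1 are assumptions.

```python
def separating_line(w0, w1, w2):
    """Slope and intercept of the line x2 = slope * x1 + intercept (Equations 7 and 8)."""
    if w2 == 0:
        raise ValueError("w2 = 0 gives a vertical separating line")
    return -w1 / w2, -w0 / w2


def classify(w0, w1, w2, x1, x2):
    """Quantized output of the two-input neuron; y = 0 is assigned to +1 by convention."""
    y = x1 * w1 + x2 * w2 + w0           # analog response of Equation 6
    return 1 if y >= 0 else -1


w0, w1, w2 = -1, 1, 1                     # illustrative weights
slope, intercept = separating_line(w0, w1, w2)
outputs = {(x1, x2): classify(w0, w1, w2, x1, x2)
           for x1 in (+1, -1) for x2 in (+1, -1)}
```

With these weights the line is x2 = -x1 + 1, and only the pattern (+1, +1) lies on its positive side, so the neuron realizes the AND function.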

The input/output mapping obtained in Figure 14 illustrates an example of a 
linearly separable function. An example of a nonlinearly separable function 
with two inputs is the following: 

(+1, +1) → +1 
(+1, -1) → -1 
(-1, -1) → +1 
(-1, +1) → -1    (9) 

No single line exists that can achieve this separation of the input patterns. 

With two inputs, a single neuron can realize almost all possible logic 
functions. With many inputs, however, only a small fraction of all possible 
logic functions are linearly separable. The single neuron can realize only 
linearly separable functions and generally cannot realize most functions. 
However, combinations of neurons or networks of neurons can be used to 
realize nonlinearly separable functions. 
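
The two-input case is small enough to check by brute force. The sketch below tries every weight triple from a small integer grid (the grid is an assumption, chosen because it is large enough for two inputs) and records which of the 16 possible logic functions a single neuron produces. Weight settings that put a pattern exactly on the separating line are skipped.

```python
from itertools import product

# The four input patterns, in a fixed order: (1,1), (1,-1), (-1,1), (-1,-1).
PATTERNS = list(product((+1, -1), repeat=2))


def realizable_functions(weight_values=(-2, -1, 0, 1, 2)):
    """Collect the truth tables a single two-input neuron can produce."""
    found = set()
    for w0, w1, w2 in product(weight_values, repeat=3):
        ys = [w0 + w1 * x1 + w2 * x2 for x1, x2 in PATTERNS]
        if all(y != 0 for y in ys):                   # ignore patterns on the line
            found.add(tuple(1 if y > 0 else -1 for y in ys))
    return found


functions = realizable_functions()
count = len(functions)

# The function of Equation 9: +1 when the two inputs agree.
equation_9 = tuple(1 if x1 == x2 else -1 for x1, x2 in PATTERNS)
```

The search finds 14 of the 16 functions; the two that are missing are the function of Equation 9 and its complement.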


Nonlinear separability—Madaline net¬ 
works. In the early 1960s at Stanford, 
Ridgway 6 initiated an approach to the 
implementation of nonlinearly separable 
logic functions. He connected retinal 
inputs to adaptive neurons in a single layer 
and, in turn, connected their outputs to a 
fixed logic device providing the system 
output. Methods for adapting such nets 
were developed at that time. In the example 
network shown in Figure 15, two Adalines 
are connected to an AND logic device 
to provide an output. Systems of this type 
were called Madalines (many Adalines). 
Today such systems would be called small 
neural nets. 

With weights suitably chosen, the 
separating boundary in pattern space for 
the system of Figure 15 would be as shown 
in Figure 16. This separating boundary 
implements the nonlinearly separable logic 
function of Equation 9. 
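
One suitable choice of weights can be written out explicitly. The weights below are an assumption (the article does not list the values used in Figure 15); they place two parallel separating lines on either side of the diagonal, and the fixed AND device is itself written as a threshold element.

```python
def adaline(weights, bias, inputs):
    """Linear combiner followed by a quantizer with outputs +1 and -1."""
    y = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if y > 0 else -1


def madaline(x1, x2):
    """Two first-layer Adalines feeding a fixed AND device."""
    a = adaline([-1, 1], 1, [x1, x2])     # -1 only for the pattern (+1, -1)
    b = adaline([1, -1], 1, [x1, x2])     # -1 only for the pattern (-1, +1)
    return adaline([1, 1], -1, [a, b])    # AND: +1 only if both a and b are +1


truth_table = {(x1, x2): madaline(x1, x2)
               for x1 in (+1, -1) for x2 in (+1, -1)}
```

Each first-layer Adaline cuts off one of the two patterns that must map to -1, and the AND of the two decisions leaves +1 only in the band between the lines.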

Madalines were constructed with many more inputs, with many more neurons in 
the first layer, and with fixed logic devices such as AND, OR, and MAJority 
vote-taker in the second layer. Those three functions, as illustrated in 
Figure 17, are in themselves threshold logic functions. The weights given 
will implement these functions, but the weight choices are not unique. 
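
That AND, OR, and MAJ are threshold logic functions can be verified directly. The weights in this sketch are one possible choice and are not necessarily those of Figure 17: unit input weights with a bias of -(n - 1) for AND, +(n - 1) for OR, and 0 for MAJ, where n is the number of inputs (odd for MAJ so that ties cannot occur).

```python
from itertools import product


def threshold_element(weights, bias, inputs):
    """Weighted sum plus bias, quantized to +1 or -1."""
    y = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if y > 0 else -1


def and_gate(inputs):
    n = len(inputs)
    return threshold_element([1] * n, -(n - 1), inputs)


def or_gate(inputs):
    n = len(inputs)
    return threshold_element([1] * n, n - 1, inputs)


def maj_gate(inputs):
    # Assumes an odd number of inputs, so the sum is never zero.
    return threshold_element([1] * len(inputs), 0, inputs)


# Truth tables over all eight patterns of three inputs valued +1 or -1.
table = {p: (and_gate(p), or_gate(p), maj_gate(p))
         for p in product((+1, -1), repeat=3)}
```

Scaling all weights and the bias by any positive factor leaves every output unchanged, which is one way to see that the weight choices are not unique.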

Layered neural nets. The Madalines of the 1960s had adaptive first layers and 
fixed threshold functions for the second (output) layers. 6,7 The 
feed-forward neural nets of the 1980s have many layers, and all layers are 
adaptive. The back-propagation method of Rumelhart et al. 8 is perhaps the 
best-known example of a multilayer network. A three-layer feed-forward 
adaptive network is illustrated in Figure 18. 

Adapting the neurons in the output 
layer is simple, since the desired responses 
for the entire network (given with each 
input training pattern) are the desired 
responses for the corresponding output 
neurons. Given the desired responses, 
adaptation of the output layer can be a 
straightforward exercise of the LMS algo¬ 
rithm. The fundamental difficulty 
associated with adapting a layered net¬ 
work lies in obtaining desired responses 
for the neurons in the layers other than the 
output layer. The back-propagation algo¬ 
rithm (first reported by Werbos 9 and later 
rediscovered by Parker 10 and by Rumel¬ 
hart et al. 8 ) is one method for establishing 
desired responses for the neurons in the 
“hidden layers,” those layers whose neu¬ 
ronal outputs do not appear directly at the 
system output (see Figure 18). There is 
nothing unique about the choice of desired 
outputs for the hidden layers. 

Generalization in layered networks is a 
key issue. The question is, how well do 
multilayered networks perform with 
inputs for which they were not specifically 
trained? The question of generalization is 
important, and theorists are developing 
some good examples where useful gener¬ 
alizations take place. Many different 
algorithms may be needed for the adapta¬ 
tion of multilayered networks to produce 
required generalizations. Without gener¬ 
alization, neural nets will be of little 
engineering significance. Merely learning 
the training patterns can be accomplished 
by storing these patterns and their 
associated desired responses in a look-up 
table. 

Figure 17. Neuronal implementation of AND, OR, and MAJ logic functions.

Figure 18. A three-layer adaptive neural network.

The layered networks of Rumelhart et al. 8 use neuronal elements like the Adaline of Figure 3, except that the quantizer or threshold device is a soft-limiting “sigmoid” function rather than the hard-limiting “signum” function of the Adaline. The various back-propagation
algorithms for adapting layered networks 
of neurons require differentiability along 
the network’s signal paths and cannot 
work with the Adaline element’s hard 
limiter. The sigmoid function has the 
necessary differentiability. However, it 
presents implementational difficulties if 
the neural net is ultimately constructed 
digitally. For this reason, we developed a 
new algorithm for adapting layered 
networks of Adaline neurons with hard- 
limiting quantizers. The new algorithm is 
an extension of the original Madaline 
adaptation rule 6,7 and is called Madaline 
rule II or MRII. The idea is to adapt the 
network to properly respond to the newest 
input pattern while minimally disturbing 
the responses already trained-in for the 
previous input patterns. Unless this prin¬ 
ciple is practiced, it is difficult for the net¬ 
work to simultaneously store all of the 
required pattern responses. 

LMS or Widrow-Hoff delta rule for the single neuron. The LMS algorithm applied to the adaptation of the weights of a single neuron embodies a minimal disturbance principle. This algorithm can be written as

$$W_{k+1} = W_k + \alpha \epsilon_k X_k / |X_k|^2 \qquad (10)$$

The time index or adaptation cycle number is $k$. $W_{k+1}$ is the next value of the weight vector, $W_k$ is the present value of the weight vector, $X_k$ is the present input pattern vector, and $\epsilon_k$ is the present error (that is, the difference between the desired response $d_k$ and the analog output before adaptation). Applying the above recursion formula to each adaptation cycle reduces the error by the fraction $\alpha$. That is, at the $k$th cycle, the error is

$$\epsilon_k = d_k - X_k^T W_k \qquad (11)$$

Changing the weights changes (reduces) the error:

$$\Delta\epsilon_k = \Delta(d_k - X_k^T W_k) = -X_k^T \Delta W_k \qquad (12)$$

In accordance with the LMS rule (Equation 10), the weight change is as follows:

$$\Delta W_k = W_{k+1} - W_k = \alpha \epsilon_k X_k / |X_k|^2 \qquad (13)$$

Combining Equations 12 and 13, we obtain

$$\Delta\epsilon_k = -X_k^T \alpha \epsilon_k X_k / |X_k|^2 = -\alpha \epsilon_k X_k^T X_k / |X_k|^2 = -\alpha \epsilon_k \qquad (14)$$

Therefore, the error is reduced by the fraction $\alpha$ as the weights are changed while holding the input pattern fixed. Putting in a new input pattern starts the next adaptation cycle. The next error is then reduced by the fraction $\alpha$, and the process continues.

The choice of $\alpha$ controls stability and speed of convergence. 1 Stability requires that

$$2 > \alpha > 0 \qquad (15)$$

Making $\alpha$ greater than 1 generally does not make sense, since the error would be overcorrected. Total error correction comes with $\alpha = 1$. A practical range for $\alpha$ is

$$1.0 > \alpha > 0.1 \qquad (16)$$
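
The error reduction of Equation 14 can be checked numerically. The following is a minimal sketch of one adaptation cycle of Equation 10; the starting weights, input pattern, and function names are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lms_update(w, x, d, alpha):
    """One step of Equation 10; returns the new weights and the error before adapting."""
    err = d - dot(x, w)                    # Equation 11
    scale = alpha * err / dot(x, x)        # alpha * error / |X|^2
    return [wi + scale * xi for wi, xi in zip(w, x)], err

w = [0.2, -0.4, 0.1]     # illustrative starting weights
x = [1, -1, 1]           # bias input plus two symmetric (+1/-1) inputs
d = 1.0                  # desired response
alpha = 0.5

w_new, err_before = lms_update(w, x, d, alpha)
err_after = d - dot(x, w_new)
print(err_before, err_after)
```

With the input pattern held fixed, the error after the step is (1 - alpha) times the error before it, so alpha = 1 removes the error entirely and alpha = 0.5 halves it.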

The weights change proportionately with their inputs in accordance with the LMS algorithm (see Equation 10). With the usual binary inputs 1 and 0, no adaptation occurs for weights with 0 inputs. Thus, the symmetric inputs +1 and -1 are generally preferred.

To verify that the LMS rule embodies the minimal disturbance principle, refer to Equation 13. The weight-change vector $\Delta W_k$ is chosen to be parallel to the input pattern vector $X_k$. From Equation 12, the change in the error is equal to the negative dot product of $X_k$ with $\Delta W_k$, thus achieving the needed error correction with the smallest magnitude of weight vector change. When adapting to respond properly to a new input pattern, the responses to previous training patterns are therefore minimally disturbed, on the average. The algorithm also minimizes mean square error, the property for which it is best known.
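
The smallest-magnitude claim can be made explicit with a short derivation. This step is not spelled out in the article; it is a sketch that uses only Equations 12 and 13 and the Cauchy-Schwarz inequality.

```latex
\begin{aligned}
&\text{Required error change: } \Delta\epsilon_k = -\alpha\epsilon_k
  \;\Longleftrightarrow\; X_k^T \Delta W_k = \alpha\epsilon_k
  \quad \text{(Equation 12)} \\
&\text{Cauchy-Schwarz: } |\alpha\epsilon_k| = |X_k^T \Delta W_k|
  \le |X_k|\,|\Delta W_k|
  \;\Longrightarrow\; |\Delta W_k| \ge \frac{|\alpha\epsilon_k|}{|X_k|} \\
&\text{Equality holds only when } \Delta W_k \text{ is parallel to } X_k,
  \text{ that is, } \Delta W_k = \frac{\alpha\epsilon_k X_k}{|X_k|^2}
  \quad \text{(Equation 13)}
\end{aligned}
```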

Figure 19. One slab of a left-right, up-down translation invariant network.

Adaptation of layered neural nets by the MRII rule. The minimal disturbance principle can be applied to the adaptation of Figure 18’s layered neural network. When an input pattern and its associated desired responses are presented to the network, the training objective is to reduce the number of errors to as low a level as possible. Accordingly, when the first training pattern is presented, the first layer will adapt as required to reduce the number of response errors at the final output layer. In accordance with the minimal disturbance principle, the first-layer neuron whose analog response is closest to zero is given a trial adaptation in the direction to reverse its binary output. When the reversal takes place, the second-layer inputs change, the second-layer outputs change, and consequently the network outputs change. A check is made to see if this reduces the number of output errors for the current input pattern. If so, the trial change is accepted. If not, the weights are restored to their previous values, and the first-layer neuron whose analog response is next closest to zero is trial adapted, reversing its response. If this reduces the number of output errors, the change is accepted. If not, the weights are restored, and one goes on to adaptively switch the neuron with an analog response next closest to zero, and so on, disturbing the neurons as little as possible. After adapting all neurons whose output reversals reduced the number of output errors, neurons are then chosen in pair combinations and trial adaptations are made which can be accepted if output errors are reduced. After adapting the first-layer neurons in singles, pairs, triples, etc., up to a predetermined limit in combination size (simulation results indicate pairwise trials are sufficient in layers having up to 25 Adalines), the second layer is adapted to further reduce the number of network output errors. The method of choosing the neurons to be adapted in the second layer is the same as that for the first layer. If further error reduction is needed, the output layer is then adapted. This is straightforward, since the output-layer desired responses are the given desired responses for the entire network. After adapting the output layer, the responses will be correct. The next input pattern vector and its associated desired responses are then applied to the neural network, and the adaptive process resumes.
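
The single-neuron trial step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the network has one adaptive first layer feeding a fixed majority element with a single output, only single-neuron trials are made (no pairs or triples), and each trial uses the LMS step of Equation 10 with alpha = 1 toward the reversed +1/-1 target. All function names are hypothetical.

```python
def sgn(v):
    return 1 if v >= 0 else -1

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def forward(layer1, x):
    """First-layer Adalines feeding a fixed majority vote-taker."""
    return sgn(sum(sgn(dot(w, x)) for w in layer1))

def mrii_single_trials(layer1, x, d):
    """Try reversing one first-layer Adaline at a time, least confident first.

    Returns True if the output for pattern x equals the desired response d
    afterwards. Rejected trial changes are undone.
    """
    if forward(layer1, x) == d:
        return True
    # Neurons whose analog response is closest to zero are disturbed first.
    order = sorted(range(len(layer1)), key=lambda i: abs(dot(layer1[i], x)))
    for i in order:
        saved = layer1[i]
        y = dot(saved, x)
        target = -sgn(y)                      # reversed binary output
        step = (target - y) / dot(x, x)       # Equation 10 with alpha = 1
        layer1[i] = [wi + step * xi for wi, xi in zip(saved, x)]
        if forward(layer1, x) == d:           # output error removed: accept
            return True
        layer1[i] = saved                     # otherwise restore the weights
    return False

layer1 = [[0.1, 0.0, 0.0], [-1.0, 0.0, 0.0], [-2.0, 0.0, 0.0]]
x = [1, 1, -1]                                # bias input, then two pattern inputs
print(mrii_single_trials(layer1, x, d=+1), layer1)
```

In this example the neuron closest to zero is tried first, but reversing it does not correct the output, so its weights are restored; reversing the next closest neuron does correct the output and is accepted, and the third neuron is never touched.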

When training the network to respond correctly to the various input patterns, the “golden rule” is to give the responsibility to the neuron or neurons that can most easily assume it. In other words, don’t rock the boat any more than necessary to achieve the desired training objective. (Simulation results using this minimal-disturbance MRII algorithm are presented later.)

Application of layered networks to pattern recognition. It would be useful to devise a neural net configuration that could be trained to classify an important set of training patterns as required, but have these network responses be invariant to translation, rotation, and scale change of the input pattern within the field of view. It should not be necessary to train the system with the specific training patterns of interest in all combinations of translation, rotation, and scale.

The first step is to show that a neural 
network having these properties exists. 
(The invariance methods that follow are 
extensions of results reported earlier by 
Widrow. 2 ) The next step is to obtain 
training algorithms to achieve the desired 
objectives. 

Invariance to up-down, left-right pat¬ 
tern translation. Figure 19 shows a planar 
network configuration (a “slab” of neu¬ 
rons) that could be used to map a retinal 
image into a single-bit output so that, with 
proper weights in the network’s neurons, 
the response will be insensitive to left-right 
and/or up-down translation. The same 
slab structure can be replicated, with 
different weights, to allow the retinal pat¬ 
tern to be independently mapped into 
additional single-bit outputs, all insensitive 
to left-right, up-down translation. 

Figure 20. A preprocessor network and an adaptive two-layer descrambler network.

Figure 20 illustrates the general idea. A retinal image having a given number of pixels can be mapped through an array of
slabs into a different image having the 
same, more, or fewer pixels, depending on 
the number of slabs used. In any event, the 
mapped image is insensitive to up-down, 
left-right translation of the original image. 
The mapped image in Figure 20 is fed to a 
set of Adaline neurons that can be easily 
trained to provide output responses to the 
original image as required. This amounts 
to a “descrambling” of the preprocessor’s 
outputs. The descrambler’s output 
responses classify the original input images 
and, at the same time, are insensitive to 
their left-right, up-down translations. 

In the systems of Figures 19 and 20, the elements labeled “AD” are Adalines. Those labeled “MAJ” are majority vote-takers. (If the number of input lines to MAJ is even and there is a tie vote, these elements are biased to give a positive response.) The AD elements are adaptive neurons and the MAJ elements are fixed neurons, as in Figure 17.

In the system shown in Figure 19, the structuring of the weights so that the output is insensitive to left-right and up-down translation needs further explanation. Let the weights of each Adaline be arranged in a square array and the corresponding retinal pixels arrayed in a square pattern. Let the square matrix $(W_1)$ designate the array of weights of the upper-left Adaline, and let $T_{D1}(W_1)$ be the array of weights of the next lower Adaline. The operator $T_{D1}$ represents “translate down one,” so the second set of weights is the same as the topmost set, but translated down en masse by one pixel. The bottom row wraps around to comprise the top row. The patterns on the retina itself wrap around on a cylinder when they undergo translation. The weights of the next lower Adaline are $T_{D2}(W_1)$, and those of the next lower Adaline are $T_{D3}(W_1)$. Returning to the upper-left Adaline, let its neighbor to the right be designated by $T_{R1}(W_1)$, with $T_{R1}$ being a “translate right one” operator.
The pattern of weights for the entire array of Adalines in Figure 19 is

$$
\begin{array}{llll}
(W_1) & T_{R1}(W_1) & T_{R2}(W_1) & T_{R3}(W_1) \\
T_{D1}(W_1) & T_{R1}T_{D1}(W_1) & T_{R2}T_{D1}(W_1) & T_{R3}T_{D1}(W_1) \\
T_{D2}(W_1) & T_{R1}T_{D2}(W_1) & T_{R2}T_{D2}(W_1) & T_{R3}T_{D2}(W_1) \\
T_{D3}(W_1) & T_{R1}T_{D3}(W_1) & T_{R2}T_{D3}(W_1) & T_{R3}T_{D3}(W_1)
\end{array}
\qquad (17)
$$

Figure 21. Learning curve for a two-layer, 25 x 25 adaptive descrambler.

As the input pattern moves up, down, 
left, or right on the retina, the roles of the 
various Adalines interchange. Since all 
Adaline outputs are equally weighted by 
the MAJ element, translating the input 
pattern up-down and/or left-right on the 
retina has no effect on the MAJ element 
output. 
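
The interchange argument can be checked with a small simulation. The sketch below is illustrative only: it assumes a 4 x 4 retina that wraps around in both directions, integer key weights (so the sums are exact), and a tie-to-positive MAJ element; the function names are hypothetical.

```python
import random

def sgn(v):
    return 1 if v >= 0 else -1      # ties give +1, as for the MAJ element

def translate(a, down, right):
    """Shift a square array down and right with wraparound."""
    n = len(a)
    return [[a[(r - down) % n][(c - right) % n] for c in range(n)]
            for r in range(n)]

def slab_output(key, image):
    """MAJ over Adalines whose weights are all translations of the key weights."""
    n = len(key)
    votes = 0
    for down in range(n):
        for right in range(n):
            w = translate(key, down, right)
            votes += sgn(sum(w[r][c] * image[r][c]
                             for r in range(n) for c in range(n)))
    return sgn(votes)

random.seed(0)
n = 4
key = [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]
image = [[random.choice((-1, 1)) for _ in range(n)] for _ in range(n)]

# Collect the slab response for every translation of the same image.
outputs = {slab_output(key, translate(image, dr, dc))
           for dr in range(n) for dc in range(n)}
print(outputs)   # a single value: the response does not depend on the shift
```

Translating the image only permutes which Adaline produces which analog sum, so the set of votes seen by the MAJ element is unchanged.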

The set of “key” weights $(W_1)$ can be randomly chosen. Once chosen, they can be translated according to Equation 17 to fill out the array of weights for the system of Figure 19. This array of weights can be incorporated as the weights for the first slab of Adalines shown in Figure 20. The weights for the second slab would require the same translational symmetries, but be based on a different randomly chosen set of key weights $(W_2)$. The mapping function of the second slab would therefore be distinct from that of the first slab.

The translational symmetries in the 
weights called for in Figure 19 could be 
fixed and manufactured in, or they could 
be arrived at through training. If, when 
designing an application-specific pattern 
recognition system, one knew that trans¬ 
lational invariance would be required, it 
would make sense to manufacture the 
appropriate symmetry into a fixed weight 
system, leaving only the final-output Ada- 
line layers plastic and trainable (see Figure 
20). Such a preprocessor would definitely 
work, would provide very high speed 
response without scanning and searching 
for pattern location and alignment, and 
would be an excellent application of neu¬ 
ral nets. 

Invariance to rotation. Figure 20 represents a system for preprocessing retinal patterns with a translation-invariant fixed neural net followed by a two-layer adaptive descrambler net. The system can be expanded to incorporate rotational invariance. Suppose that all input patterns can be presented in “normal” vertical orientation, approximately centered within the field of view of the retina. Suppose further that all input patterns can be presented when rotated from normal by 90, 180, and 270 degrees. Thus, each pattern can be presented in all four rotations and in all possible left-right, up-down translations. The number of combinations would be large. The problem is to design a neural net preprocessor that is invariant to translation and to rotation by 90 degrees.

Figure 22. A Madaline system for pattern recognition.

Begin with a single slab of Adaline elements, as shown in Figure 19, producing a majority output that is insensitive to translation of the input pattern on the retina. Next, replicate this slab four-fold, and let the majority outputs feed into a single majority output element. In the first slab, $(W_1)$ designates the upper-left Adaline’s matrix of weights. (See Equation 17 for the weight matrices of all first-slab Adalines.) In the second slab, the upper-left Adaline’s weight matrix corresponds to the first-slab weight matrix rotated 90 degrees clockwise. This can be designated by $R_{C1}(W_1)$, and the corresponding third- and fourth-slab weight matrices can be designated by $R_{C2}(W_1)$ and $R_{C3}(W_1)$. Thus, the weight matrices of the upper-left Adalines begin with $(W_1)$ in the first slab, and are rotated clockwise by 90 degrees in the second slab, by 180 degrees in the third slab, and by 270 degrees in the fourth slab. The weight matrices of all slabs are translated right and down, in the fashion of Equation 17, starting with the Adalines in the upper left-hand corner. For example, the array of weight matrices for the second slab is

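$$
\begin{array}{llll}
R_{C1}(W_1) & T_{R1}R_{C1}(W_1) & T_{R2}R_{C1}(W_1) & T_{R3}R_{C1}(W_1) \\
T_{D1}R_{C1}(W_1) & T_{R1}T_{D1}R_{C1}(W_1) & T_{R2}T_{D1}R_{C1}(W_1) & T_{R3}T_{D1}R_{C1}(W_1) \\
T_{D2}R_{C1}(W_1) & T_{R1}T_{D2}R_{C1}(W_1) & T_{R2}T_{D2}R_{C1}(W_1) & T_{R3}T_{D2}R_{C1}(W_1) \\
T_{D3}R_{C1}(W_1) & T_{R1}T_{D3}R_{C1}(W_1) & T_{R2}T_{D3}R_{C1}(W_1) & T_{R3}T_{D3}R_{C1}(W_1)
\end{array}
\qquad (18)
$$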

Clearly, translating the pattern on the retina does not change the majority output response. Rotating the pattern 90 degrees causes an interchange of the roles of the slabs in making their responses, but, since the output majority element weights them equally, the output response is unchanged. Insensitivity to 45-degree rotation can be accomplished by using more slabs; thus, a complete neural network providing invariance to rotation and translation could be constructed. Each translation-invariant slab of Figure 20 would need to be replaced by the rotation-invariant multiple slab and majority-element system described above.
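
The four-slab arrangement can be simulated in the same way as a single slab. The sketch below is an illustration under assumed conditions (a 4 x 4 retina with wraparound in both directions, integer key weights, tie-to-positive majority elements); the function names are hypothetical.

```python
import random

def sgn(v):
    return 1 if v >= 0 else -1

def translate(a, down, right):
    """Shift a square array down and right with wraparound."""
    n = len(a)
    return [[a[(r - down) % n][(c - right) % n] for c in range(n)]
            for r in range(n)]

def rotate_cw(a):
    """Rotate a square array 90 degrees clockwise."""
    return [list(row) for row in zip(*a[::-1])]

def slab_output(key, image):
    """MAJ over Adalines whose weights are all translations of the key weights."""
    n = len(key)
    votes = sum(
        sgn(sum(w[r][c] * image[r][c] for r in range(n) for c in range(n)))
        for w in (translate(key, d, s) for d in range(n) for s in range(n)))
    return sgn(votes)

def four_slab_output(key, image):
    """MAJ over four slabs with key weights rotated by 0, 90, 180, 270 degrees."""
    votes, k = 0, key
    for _ in range(4):
        votes += slab_output(k, image)
        k = rotate_cw(k)
    return sgn(votes)

random.seed(1)
n = 4
key = [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]
image = [[random.choice((-1, 1)) for _ in range(n)] for _ in range(n)]

# Present the image in all four rotations and all translations.
outputs = set()
rotated = image
for _ in range(4):
    for dr in range(n):
        for dc in range(n):
            outputs.add(four_slab_output(key, translate(rotated, dr, dc)))
    rotated = rotate_cw(rotated)
print(outputs)   # a single value for all 64 presentations
```

Rotating the image by 90 degrees makes each slab respond as its neighboring slab did before the rotation, so the final majority element sees the same four votes in a different order.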

Invariance to scale. The same principles can be used to design invariance nets that are insensitive to scale or pattern size. By establishing a “point of expansion” on the retina so that input patterns can be expanded or contracted with respect to this point, two Adalines can be trained to give similar responses to patterns of two different sizes if the weight matrix of one expands (or contracts) about the point of expansion like the patterns themselves. The amplitude of the weights must be scaled in inverse proportion to the square of the linear dimension of the retinal pattern. By adding many more slabs, the invariance net can be built around this idea to be insensitive to pattern size as well as to translation and rotation. (Implementation would, of course, require the abundance and low cost of VLSI electronics.)
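
The inverse-square scaling of the weight amplitude follows from a simple pixel count. The derivation below is an idealized sketch, not taken from the article: it assumes the pattern $x$ and the weight array $w$ are both enlarged by the same linear factor $s$ about the point of expansion, ignores pixel quantization at the edges, and introduces $c$ as the unknown amplitude factor.

```latex
\begin{aligned}
y   &= \sum_{p} w(p)\,x(p)
      && \text{(analog response to the original pattern)} \\
y_s &= \sum_{p} w_s(p)\,x_s(p), \qquad x_s(p) = x(p/s),\quad w_s(p) = c\,w(p/s)
      && \text{(pattern and weights enlarged by } s\text{)} \\
y_s &\approx c\,s^2 \sum_{q} w(q)\,x(q) = c\,s^2\,y
      && \text{(each original pixel now covers about } s^2 \text{ pixels)} \\
y_s &= y \;\Longrightarrow\; c = 1/s^2
\end{aligned}
```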

Simulation results. The system of Figure 20 was computer simulated. The training set consisted of 36 patterns, each arranged on a 5 × 5 pixel retina in “standard” position. Twenty-five slabs, each with 25 Adalines having weights fixed according to Equation 17, were used in the translation-invariant preprocessor. The preprocessor output represented a scrambled version of the input pattern. The nature of this scrambling was determined by the choice of the key weight matrices $(W_1), \ldots, (W_{25})$. These key weights were chosen randomly; the only requirement was that the map from input pattern to preprocessor output be one-to-one. (This choice of weights produced a very noise-intolerant mapping. We are investigating methods of training in the key weights, using MRII to customize them to the training set.)

We used MRII to train the descrambler, a two-layer system with 25 Adalines in each layer. The initial descrambler weights were chosen randomly and independently, distributed uniformly on the interval (-1, +1). Patterns were presented in random order, each pattern being equally likely to be presented next. The desired response used was the training pattern in standard position. The system as a whole would then recognize any trained-in pattern in any translated position on the input retina and reproduce it in standard position at the output. Figure 21, a typical descrambler learning curve, graphs the number of incorrect pixels at the output, averaged over the training set, every 50 pattern presentations.

Much work on MRII remains to be 
done, including detailed studies of its con¬ 
vergence properties and its ability to pro¬ 
duce generalizations. Preliminary results 
are very encouraging. Applying the algo¬ 
rithm to problems will lead to insights that 
will, we hope, allow a mathematical anal¬ 
ysis of the algorithm. 


The concept of using an invariance 
preprocessor followed by a descrambler is 
a potentially powerful one. We plan to 
apply the concept to a speech recognition 
problem. When a word is spoken by differ¬ 
ent people or even by the same person at 
different times, the sounds produced dif¬ 
fer greatly but remain recognizable as the 
same word—at least to a human. There¬ 
fore, those sounds must have properties 
that are invariant from utterance to utter¬ 
ance. We believe a system similar to the 
one in Figure 20 would be useful in 
developing a multiuser speech recognition 
system. 

The general pattern-recognition concept we’ve described involves use of an invariance net followed by a trainable classifier. Figure 22 illustrates the key ideas. The invariance net can be trained or designed to produce a set of outputs that are insensitive to translation, rotation, scale change, etc., of the retinal pattern. These outputs are scrambled, but the adaptive layers can be trained to descramble them and reproduce the original patterns in “standard” position, orientation, and scale. Multilayer adaptation algorithms are essential to making such a scheme work, and we’ve devised a new Madaline adaptation rule, MRII, for that purpose. Our preliminary experimental results indicate that it works and is effective.

VLSI Implementation of a Neural Network Model

Hans P. Graf, Lawrence D. Jackel, and Wayne E. Hubbard 
AT&T Bell Laboratories 


Models of neural networks are receiving widespread attention as potential new architectures for computing systems. The models we consider here consist of highly interconnected networks of simple computing elements. A computation is performed collectively by the whole network with the activity distributed over all the computing elements. This collective operation results in a high degree of parallel computation and gives the network the potential to solve complex problems quickly. Neural network models have demonstrated functions such as associative memory, adaptive learning from examples, and combinatorial optimization.

To date, the research concerning neural 
network models has focused mainly on 
theoretical studies and computer simula¬ 
tions. However, the real promise for appli¬ 
cations of the models lies in specialized 
hardware, in particular specialized micro¬ 
electronic circuits. Simulations of large 
networks on serial computers are painfully 
slow, and only with customized hardware 
can we hope to realize neural network 
models with speeds fast enough for appli¬ 
cations. 

To obtain the full benefit of neural network algorithms, we need special-purpose hardware. An experimental CMOS VLSI circuit was tested as an associative memory and as a pattern classifier.

Digital accelerators to simulate neural networks are now commercially available, but they are still orders of magnitude slower than what we can achieve by directly fabricating a network with hardware. Several researchers have built models with discrete electronic components. These implementations help us study properties such as the dynamics of these circuits, but they are too bulky for real applications.

The most promising approach for implementing electronic neural nets is to fabricate special-purpose very-large-scale-integration chips. With today’s integration density, a large number of simple processors can be packed on a single chip together with the necessary interconnections to make a collective computing network. Several groups have initiated experiments with VLSI implementations and demonstrated a few functioning circuits. 1-3

Attempts are under way to build neural 
network models with optics. 4 The high 
interconnectivity of the networks makes 
optics attractive because interconnections 
can be made optically in three dimensions. 
The circuit on a microchip is bound to the 
two dimensions of the chip’s surface. 
However, optical computing technology is 
still in its infancy and realizations suitable 
for applications probably lie further in the 
future. 

We describe a complementary metal oxide semiconductor (CMOS) very large scale integrated (VLSI) circuit implementing a connectionist neural network model that consists of an array of 54 simple processors fully interconnected with a programmable connection matrix. This experimental design tests the behavior of a large network of processors integrated on a chip. We can operate the circuit in several different configurations by programming the interconnections between the processors. We made tests with the circuit working as an associative memory and as a pattern classifier. The results were so encouraging that we interfaced the chip to a minicomputer and now use it as a coprocessor in pattern recognition experiments. This mode of operation allows us to test the chip’s behavior in a real application and study how pattern recognition algorithms can be mapped into such a network.

Figure 1. An electronic “neuron.”

Figure 2. Schematic of the implemented circuit. At each crosspoint of an input and an output line, a resistive connection can be set. All the connections are programmable. The configuration of resistors shown here is just one example. Two inverters are connected in series and work as one amplifier unit. Feeding both the inverted and the noninverted output into the connection matrix makes it easy to control excitatory and inhibitory connections.


Connectionist model 

Biological neural networks inspired the 
models we are implementing in electronics, 
but we are not attempting to imitate real 
neurons. Clearly, the models used grossly 
simplify the biological networks. But even 
these relatively simple networks have com¬ 
plex dynamics and show interesting collec¬ 
tive computing properties. It is this 
collective computation that we try to 
exploit by building such networks. 

The type of circuit described here is often referred to as a connectionist model. In a connectionist model, an individual “neuron” does very little computation, typically just a thresholding of its input signal. The kind of computation performed by the whole network depends on the interconnections among the neurons. Figure 1 shows a possible electronic circuit for such a simple neuron. An amplifier models the cell body, wires replace the input structure (dendrite) and the output structure (axon), and resistors model the synaptic connections between neurons. The amplifier’s output voltage replaces the firing rate of a real neuron.

Each of the resistors connects the input line of the amplifier to the output line of another amplifier. Therefore, the state of an amplifier is determined by the states of all the other amplifiers. When the amplifier measures the current flowing into the input line, we can express this as

$$V_{out,i} = f\Big(\sum_j I_j\Big) = f\Big(\sum_j (V_{out,j} - V_{in,i})\,T_{ij}\Big) \qquad (1)$$

where $V_{in}$, $V_{out}$ are the input and output voltages of an amplifier; $I_j$ is the current flowing through one resistor; $T_{ij}$ is the conductance of the resistor connecting amplifier $i$ with amplifier $j$; and $f(\cdot)$ is the transfer function of the amplifier.

Equation 1 shows how the states of the amplifiers, represented by $V_{out,j}$, determine how much current flows into the input line and therefore determine the state of this amplifier. The output voltage of the amplifier is then given by its transfer characteristics. Its output connects to the input of many other amplifiers and influences their states. A network of highly interconnected amplifiers forms in this way. It is very difficult to describe the dynamics of such a system for an arbitrary distribution of the resistive connections, but many special cases can be controlled well and exploited for collective computation. Such networks have been demonstrated as functional associative memory and as optimizers. 5,6

The implemented circuit is well suited to work as an associative memory or as a pattern classifier. Having the network perform these functions does not require precise control of the gain of the amplifiers or the ability to fine tune the resistive connections. This makes it possible to design interconnections and amplifiers using only a small area, so many of these components can be integrated on a single chip.

The function described by Equation 1 
could be implemented in digital hardware. 
But this would require a complex 
multiplier-adder circuit at the input of each 
neuron, resulting in a large circuit even for 
a modest number of neurons. The imple¬ 
mentation described here uses a mixture of 
analog and digital CMOS VLSI tech¬ 
nology. 

Using analog computation, we can 
achieve a multiplication with a single resis¬ 
tor, and the summing of currents is accom¬ 
plished “for free” on the input wire of the 
amplifier. However, designing large inte¬ 
grated analog circuits is a difficult task. 
There is a strong tendency in signal 
processing today to avoid analog compu¬ 
tation and do everything digitally. Yet the 
high interconnectivity and relatively low 
precision needed for the signals in a neu¬ 
ral net are well tailored for an analog 
approach, and the gain in computing 
power of the network should outweigh the 
extra design effort. 


The circuit 

Figure 2 shows a schematic of the imple¬ 
mented circuit. It consists of an array of 54 
amplifiers with their inputs and outputs 
interconnected through a matrix of resis¬ 
tive coupling elements. All of the coupling 
elements are programmable—a resistive 
connection can be turned on or off. 

Figure 3 shows a photomicrograph of the circuit. Fabricated in CMOS technology with 2.5-micrometer design rules, it contains roughly 75,000 transistors in an area measuring 6.7 x 6.7 millimeters. By far the largest active area of the chip, almost 90 percent, is used for the programmable coupling network.

Figure 3. Photomicrograph of the chip. The modules are: 1 = amplifiers, control logic, bit decoder, input bus, and output bus; 2 = decoder to address rows in the interconnection matrix; 3 = interconnection matrix.

Figure 4. One of the coupling elements connecting the output of an amplifier to the input of another amplifier. S1 and S4 are closed when the output of the amplifier controlling them is high (noninverted output high). If a “1” is stored in RAM 1, S2 is closed and the connection is excitatory. If a “1” is stored in RAM 2, S3 is closed and the connection is inhibitory. If both RAM cells store a “0,” no current flows through this interconnection regardless of the state of the controlling amplifier.

Figure 5. Programming of the interconnections for the pattern classifier circuit. The “+” and the “-” mark excitatory and inhibitory connections, respectively. Below the schematic the stored vectors are shown.

In this design the circuit shown in Figure 4 replaces the resistors of Figure 2. The output lines of the amplifiers do not feed current into the input lines; instead they control the switches S1 and S4. This method reduces the amplifier load to just the capacitance of the output line. For each connection between two amplifiers, two memory cells control switches S2 and S3. The content of these memory cells determines the type of connection. One of three connections can be selected: an excitatory (S2 enabled), an inhibitory (S3 enabled), or an open (both disabled) state. The network contains a total of 2916 such coupling elements. Notice that when an amplifier output is low, the connection turns off completely and no current flows in either direction.

The voltage of an input line of an ampli¬ 
fier is determined by the sum of the cur¬ 
rents flowing into the node. This voltage 
adjusts to a value where the total current 
is zero. Since the input impedance of the
amplifiers is very high, this leads to

$$\sum_i I_{ij} = \sum_i \frac{\Delta V_{ij}}{R_{ij}} = 0 \qquad (2)$$

where $I_{ij}$ is the current flowing through a
resistor of the coupling element controlled
by amplifier $i$; $\Delta V_{ij}$ is the voltage difference
across a resistor, ($V_{\mathrm{in},j} - V_{DD}$,
$V_{\mathrm{in},j} - V_{SS}$); and $R_{ij}$ is the resistor, ($R_+$,
$R_-$).

Thus, the voltage $V_{\mathrm{in},j}$ is an analog
measure of the sum of the contributions of 
all the amplifiers connected to input line 
j. The amplifiers used have a high gain and 
work essentially as threshold elements, 
with the switching threshold about half¬ 
way between the ground and the supply 
voltage. Analog computation is used only 
within the connection matrix; input and 
output data and all the control signals are 
digital. 
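
The current balance described above can be illustrated with a minimal Python sketch. It assumes ideal switches, models an active excitatory connection as a resistor to the positive supply and an active inhibitory connection as a resistor to ground, and uses a 6:1 resistance ratio with a 5-volt supply. The function names, the supply voltage, and the resistance units are illustrative assumptions, not values taken from the chip.

```python
def input_line_voltage(n_exc, n_inh, r_plus=6.0, r_minus=1.0, v_dd=5.0, v_ss=0.0):
    """Voltage at which the currents flowing into one input line sum to zero.

    n_exc / n_inh: number of active excitatory / inhibitory connections.
    Returns None when no connection is active (the line is left floating).
    """
    g_exc = n_exc / r_plus    # total conductance toward V_DD
    g_inh = n_inh / r_minus   # total conductance toward V_SS
    if g_exc + g_inh == 0:
        return None
    # Solve g_exc * (v_dd - v) + g_inh * (v_ss - v) = 0 for v.
    return (g_exc * v_dd + g_inh * v_ss) / (g_exc + g_inh)


def amplifier_output(v_in, v_dd=5.0, v_ss=0.0):
    """Idealized high-gain amplifier: switches halfway between the supply rails."""
    return v_in > (v_dd + v_ss) / 2.0
```

With these assumed values, seven excitatory connections against one inhibitory connection pull the line above the threshold, while five against one leave it below.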

Data input and output are transferred 
through a register where one memory cell 


is connected to each amplifier. The input 
data are first loaded into this register. 
From there they can be loaded into the 
memory cells of the connection matrix, or 
used to initialize the circuit. Initialization 
of the circuit is done by charging the net¬ 
work with the voltage levels correspond¬ 
ing to the components of the input vector. 
The amplifiers are turned off during this 
initialization cycle. For the computation, 
the amplifiers are turned on and the net¬ 
work evolves to a stable state without any 
external control or synchronization 
between the amplifiers. After the circuit 
has reached a stable state, the output volt¬ 
age of each amplifier is stored in the reg¬ 
ister, which can then be read out. 

Programming the chip 

The circuit’s architecture facilitates 
mapping several different configurations 
into the network simply by programming 
the connections between the amplifiers. 
Figure 5 shows the arrangement of the 
interconnections for a configuration used 
to do pattern classification. The amplifiers 
are divided into two groups: the label units 
and the vector units. A number of vectors 
are stored in the circuit, each one along the 
input line of one label unit. The compo¬ 
nents of the stored vectors can have the 
values + 1, - 1, or 0. An excitatory con¬ 


nection is set for a +1 and an inhibitory 
connection for a - 1. 

The input vector is presented on the 
inputs of the vector units. Its components 
can have the values + 1 or 0. Wherever a 
+ 1 in the input vector is set, current can 
be injected or drawn from the input line of 
a label unit depending on the type of the 
connection. As described by Equation 2, 
the condition for a stable state is that the 
total current flowing into an input line 
equals 0. If the input voltage is above the 
threshold of the amplifier, this label unit 
turns on; otherwise, it remains off. 
Whether a label turns on or not is
described by Equation 3:

$$\sum_i \frac{v_i \mu_i}{R_i} \;\begin{cases} > 0: & V_{out} = \text{high} \\ < 0: & V_{out} = \text{low} \end{cases} \qquad (3)$$

where $v_i$ denotes components of the input
vector, (+1, 0); $\mu_i$ denotes components of
the stored vectors, (-1, 0, +1); and $R_i$ is
the resistance of the connections, ($R_-$,
$R_+$).

The input vector is compared in paral¬ 
lel with all the stored vectors and an inner 
product between the input vector and the 
stored vectors is evaluated. All the stored 
vectors closely resembling the input vector 
turn on their label units.* A + 1 in the 
input vector at the position of a + 1 in the 
stored vector gives a positive contribution 
to the sum, while a -1 in the stored vector
in this position gives a negative contribution.
$R_+$ is about six times larger than
$R_-$. Therefore, a mismatch (a +1 in the
input in the position of -1 in the stored
vector) counts six times as much as a
match. This ratio of $R_+$ to $R_-$ has no
great significance; it simply reflects the 
ratio of the resistance of the p-channel and 
the n-channel transistors of the CMOS cir¬ 
cuit. For the applications described later, 
it is convenient to have the inhibition 
stronger than the excitation, but we could 
obtain the same effect by using multiple 
connections. 

By using a few amplifiers as bias units, 
we can shift the threshold of a label unit. 
We can program the connections between 
the bias units and the label units to set 
different thresholds for each stored vector. 
The right-hand side of Equation 3 is then 
replaced by the bias value. 
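
The threshold rule and the effect of a bias can be sketched in a few lines of Python. This is an illustration only: resistances are in arbitrary units with the 6:1 ratio mentioned in the text, exact fractions avoid rounding questions, and the names `label_sum` and `label_on` are hypothetical.

```python
from fractions import Fraction

R_PLUS = Fraction(6)   # excitatory resistance (arbitrary units, assumed)
R_MINUS = Fraction(1)  # inhibitory resistance; 6:1 ratio as stated in the text


def label_sum(input_vec, stored_vec):
    """Weighted sum of v_i * mu_i / R_i over all vector components."""
    total = Fraction(0)
    for v, mu in zip(input_vec, stored_vec):
        if v == 1 and mu == 1:
            total += 1 / R_PLUS      # match: weak positive contribution
        elif v == 1 and mu == -1:
            total -= 1 / R_MINUS     # mismatch: strong negative contribution
    return total


def label_on(input_vec, stored_vec, bias=Fraction(0)):
    """The label unit turns on when the weighted sum exceeds the bias."""
    return label_sum(input_vec, stored_vec) > bias
```

For a stored vector with seven +1 components and one -1 component, an input that hits all eight positions gives 7/6 - 1 = 1/6 and the label turns on; with only five matches and the same mismatch the sum is negative and the label stays off.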


*The classification properties of this circuit correspond
to those of a single-layer perceptron; it can
discriminate between linearly separable patterns. Since
we are dealing with binary vectors, the configuration
space consists of the corners of an n-dimensional
hypercube. Decision regions of any shape can be built
in this space as the sum of linearly separable regions.

The number of stored vectors and their
lengths are limited by the number of 
amplifiers in the circuit. Each component 
of the input vector uses one amplifier, and 
each label unit needs one. Therefore, the 
number of bits in a vector plus the number 
of stored vectors must be smaller than 54. 
Usually, a few amplifiers are used for 
biases, reducing this number to approxi¬ 
mately 50. 

This arrangement uses only the connec¬ 
tions between the outputs of the vector 
units and the inputs of the label units, 
while the largest part of all the connections 
in the matrix remains idle. We can achieve 
a more efficient use of the circuit by using 
the register cells as the vector units. The 
arrangement described uses these cells only 
as intermediate storage for the input vec¬ 
tor, but they can also charge the matrix to 
its initial condition. All the amplifiers then 
work as label units. Up to 54 vectors, each 
54 bits long, can be stored in the network 
and are compared in parallel to the input 
vector. To read out the result, the register 
cells must switch quickly from writing the 
input vector to reading the result without 
feeding the result back into the connection 
matrix. This requires a precisely timed 
pulse controlling the length of time the 
amplifiers are turned on. This mode of 
operation is used for feature extraction 
from images (see the section “Examples of 
applications”). 

Adding inhibitory connections between 
the label units and connecting the outputs 
of the label units to the inputs of the vec¬ 
tor units can yield an associative memory. 
This arrangement is shown schematically 
in Figure 6. In this circuit, vectors with 
components + 1 and 0 are stored. In addi¬ 
tion to the connections along the input 
lines of the label units described above, 
there are inhibitory connections between 
all the label units. Each label unit inhibits 
all the other label units, but not itself. For 
the connections between the label outputs 
and the vector inputs, the same vector is 
placed along the output line of a label unit 
stored along its input line. For a + 1 in the 
vector, an excitatory connection is set, and 
for a 0, an inhibitory connection is set. 

The circuit is initialized with all input 
lines of the label units discharged to 
ground. The input vector is given on the 
input lines of the vector units. Wherever 
a + 1 in the input vector and in a stored 
vector occupy the same position, current 
is injected into the input line of a label unit. 
The speed at which an input line of a label
unit changes state depends on the inner
product of the input vector and a stored
vector. That is,

$$C \frac{dV_{in}}{dt} = \frac{V_{DD} - V_{in}}{R_+} \sum_i v_i \mu_i \qquad (4)$$

where $V_{in}$ is the voltage of the input line;
$C$ is the capacitance of the input node of
a label; $R_+$ is the resistance of the excitatory
connection; $v_i$ denotes the components
of the input vector, (0, +1); and
$\mu_i$ denotes the components of the stored vector,
(0, +1).

Figure 6. The interconnections set for an associative memory circuit.
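
To make the dependence on the inner product concrete, the following sketch assumes that each position where the input and the stored vector both hold a 1 contributes a current (V_DD - V_in)/R+ charging the node capacitance C from ground, and that the amplifier switches at half the supply voltage. Under those assumptions the time to reach the threshold is R+ C ln 2 divided by the number of common 1's. The component values are placeholders, not measured data.

```python
import math


def time_to_threshold(input_vec, stored_vec, r_plus=60e3, c_node=1e-12):
    """Seconds for a label input line to charge from ground to V_DD / 2.

    Assumes exponential charging through k parallel excitatory resistors,
    where k is the number of common 1's (the inner product).
    r_plus (ohms) and c_node (farads) are hypothetical values.
    """
    k = sum(v * mu for v, mu in zip(input_vec, stored_vec))
    if k == 0:
        return math.inf           # no current: the line never charges
    return r_plus * c_node * math.log(2) / k
```

Doubling the overlap halves the switching time, which is why the best-matching label unit wins the race.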

The stored vector that has the largest 
inner product with the input vector will 
turn on its label unit first. As one label unit 
turns on, it inhibits all the other label units. 
If this inhibition is strong enough, no other 
label unit will turn on. 

As mentioned above, an inhibition is six 
times stronger than an excitation. This 
limits the allowed mutual overlap (number 
of common 1’s) of two stored vectors to
five bits; biases are used to circumvent this 
limitation. The label unit that comes on 
first generates the vector stored along its 
output wire at the inputs of the vector 
units. In this way, in a stable state one label 
unit is on and the vector connected to that 
label appears at the outputs of the vector 
units. 

This circuit performs a minimum dis¬ 
tance classification. The input vector is 


compared with all the stored vectors in 
parallel, and the circuit determines which 
of them most closely matches the input 
vector (common 1’s are counted).
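
Functionally, the recall amounts to picking the stored vector with the most 1's in common with the input. A minimal software analogue, assuming binary lists and ignoring the analog dynamics, might look like this (the function name is hypothetical, and ties are broken by index here, whereas the hardware outcome for a tie is not specified in the text):

```python
def recall(input_vec, stored_vecs):
    """Return (index, vector) of the stored vector sharing the most 1's with the input."""
    overlaps = [sum(v & s for v, s in zip(input_vec, vec)) for vec in stored_vecs]
    best = max(range(len(stored_vecs)), key=overlaps.__getitem__)
    return best, stored_vecs[best]
```

The returned vector plays the role of the pattern that appears at the outputs of the vector units once the winning label unit has switched on.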

Extensive tests with 10 stable states, 
each 40 bits long, programmed into the 
associative memory circuit showed that the 
circuit converges to a stable state in 50 to 
600 nanoseconds. 2 Associative recall per¬ 
formed reliably, and we observed only the 
stable states programmed into the circuit; 
we saw no spurious stable states. The con¬ 
vergence time depends on how closely the 
input vector resembles the stored states. 
Additional tests run with up to 20 stable 
states, each 30 bits long, programmed into 
this circuit obtained similar results. 

In this circuit the stored data are repre¬ 
sented locally in the interconnection 
matrix. This means that a stored bit can be 
localized at one or two of the interconnec¬ 
tions. In contrast, other circuits based on 
neural network models use a distributed 
representation of the data (e.g., the outer 
product of a stored vector determines the 
distribution of the connections). 

The great interest in associative memory 
circuits implemented with neural network 
models has resulted in the description of 
several programming methods. 5-7 Experiments
conducted with this type of associative
memory soon made clear the far
superior results given by local representation
of data. With distributed representation
of data, the different stored vectors
influence each other because they share the
same interconnections. This leads to
unwanted spurious stable states and to
erroneous recalls of vectors that do not
correspond to the nearest neighbor.

Figure 7. The interface connecting the chip to a minicomputer. The network chip is
on the left side of the board. The other chips control the data exchange with the
computer.

These problems do not exist in the cir¬ 
cuit we describe here. We observed relia¬ 
ble recall of the nearest neighbor and no 
unwanted stable states. Moreover, the 
storage density is considerably higher with 
this circuit. With a distributed representa¬ 
tion of the data, it would hardly be possi¬ 
ble to get 20 arbitrary stable states in this 
circuit. But the arrangement described 
here still does not use the interconnections 
of the chip efficiently. None of the connec¬ 


tions among the vector units are used. In 
the case of 10 stored vectors, this means 
that less than one third of all the connec¬ 
tions are used. 
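
The "less than one third" figure can be checked with a short calculation. The sketch below assumes 40-bit vectors (the length used in the tests reported above) and counts every coupling element in the three blocks the associative memory uses, which gives an upper bound because open connections for 0 components are counted too.

```python
def used_fraction(n_labels, n_bits, n_amplifiers=54):
    """Upper bound on the fraction of coupling elements used by the associative memory."""
    total = n_amplifiers ** 2                    # 54 * 54 = 2916 coupling elements
    vector_to_label = n_bits * n_labels          # stored vectors along label input lines
    label_to_vector = n_labels * n_bits          # same vectors along label output lines
    label_to_label = n_labels * (n_labels - 1)   # mutual inhibition, no self-inhibition
    return (vector_to_label + label_to_vector + label_to_label) / total
```

For 10 labels and 40 bits this gives 890 of 2916 elements, about 30.5 percent.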

With the present chip we do not see a 
way to further increase storage efficiency. 
However, a design optimized for this new 
associative memory circuit can achieve 100 
percent storage density. 2 This means that 
one RAM cell stores one bit. In contrast, 
associative memory circuits with dis¬ 
tributed representation of data typically 
require five or more RAM cells to store
one bit. High storage efficiency is crucial 
for an electronic implementation. Other 
researchers have recently proposed circuits 
similar to the associative memory we have 
described. 8,9 

We can configure the circuit to move 


through a sequence of vectors by placing 
a vector along the output line of a label 
that differs from the vector along its input 
line. Mixing connections of two vectors 
can stabilize the circuit at a vector and 
make it sensitive to the next vector in the 
sequence only. When the input lies within
a set range of this vector, the circuit will
move to the next stable state; otherwise,
it just stays in its present state. Other ver¬ 
sions of programming techniques allow 
for omissions and branches in a parsed 
sequence of vectors. 10 


Examples of 
applications 

In order to evaluate the behavior of the 
circuit in an application, we used it in a 
character recognition experiment. An 
interface connected the chip to a minicom¬ 
puter to handle the transfers of the large 
amounts of data characteristic of image 
processing requests. 

Figure 7 shows the board with the net¬ 
work chip and some additional off-the- 
shelf integrated circuits to control data 
flow. Data transfers can be made directly
from the minicomputer’s memory to the 
chip at a rate between one and two mega¬ 
bytes per second. This rate is limited by the 
interface and the minicomputer and not by 
the chip. One complete processing cycle, 
which includes loading the input vector, 
accomplishing a computation with the cir¬ 
cuit, and reading back the result into the 
computer, requires approximately 25
clock cycles, or 25 to 50 microseconds.
Most of the time is required for reading the 
data in and out. The processing in the cir¬ 
cuit requires only one clock cycle. 

The whole process of recognizing a 
character proceeds as follows: A hand¬ 
written character is read with a camera, 
digitized, and then normalized in size to fill 
a 128 x 128-pixel frame. The image is then 
coarse-blocked into a 16 x 16-pixel binary 
image. After this, the character is 
skeletonized —the width of the lines is 
reduced to one pixel—and the skeletonized 
picture is searched for a number of geo¬ 
metrical features. The positions of these 
features are compared to a training set and 
a best match with one of these training 
characters is determined. Of this whole 
process, the line thinning and the feature 
extraction have been mapped onto the 
chip; the minicomputer accomplishes the 
remaining operations. 

Figure 8 shows an example of a result of 
the line thinning operation. The chip is 
used in the configuration of Figure 5 for
this task. Stored in the circuit are 20 differ¬ 
ent vectors, each representing a 5 x 5-pixel 
area. The input vector to the chip is a 
5 x 5-pixel region of the character image, 
which we will refer to as a card. Based on 
which label units are on after a run, a deci¬ 
sion is made whether or not to delete the 
center pixel of this card from the image. 
For the next run the card is shifted by one 
row or column of pixels on the image; this 
new card is used as the input vector. In this 
way the whole picture is scanned. For each 
pixel position, a processing cycle is accom¬ 
plished with the appropriate card as the 
input vector. For each pixel in the image, 
a decision has to be made whether that 
pixel makes the line fat or whether its pres¬ 
ence is crucial to keeping intact the connec¬ 
tivity of the character. 

Figure 9 shows an example of a vector 
stored in the circuit to make one such deci¬ 
sion. If the label connected to this vector 
turns on, then the center pixel is part of a 
diagonal line or a corner and that line is at 
least two pixels wide. In this situation, the 
pixel can be deleted without destroying the 
connectivity of the character. The 20 vec¬ 
tors stored on the chip analyze the neigh¬ 
borhood of each pixel for all the 
configurations that allow its deletion. All 
the computation needed to decide whether 
a pixel can be deleted is done in one clock 
cycle. A total line-thinning operation 
requires about three scans across the whole 
character, depending on the width of the 
lines because only boundary pixels are 
deleted. 
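
The scanning procedure can be sketched in software as follows. This is an illustration under stated assumptions, not the chip's actual kernel set: a kernel entry of +1 stands for an excitatory connection, -1 for an inhibitory one weighted six times as strongly, the bias is expressed in units of one excitatory connection, pixels outside the image count as white, and deletions are decided on the image as it was at the start of the scan.

```python
INHIBITION_WEIGHT = 6  # one mismatch outweighs six matches (assumed from the R+/R- ratio)


def card_at(image, row, col, size=5):
    """Return the size x size neighborhood centered on (row, col), zero-padded."""
    half = size // 2
    h, w = len(image), len(image[0])
    return [[image[r][c] if 0 <= r < h and 0 <= c < w else 0
             for c in range(col - half, col + half + 1)]
            for r in range(row - half, row + half + 1)]


def label_fires(card, kernel, bias):
    """True when matches minus weighted mismatches plus bias is positive."""
    score = bias
    for card_row, kernel_row in zip(card, kernel):
        for pixel, k in zip(card_row, kernel_row):
            if pixel and k == 1:
                score += 1
            elif pixel and k == -1:
                score -= INHIBITION_WEIGHT
    return score > 0


def thinning_scan(image, kernels):
    """One scan over the image; kernels is a list of (kernel, bias) pairs."""
    result = [row[:] for row in image]
    for r in range(len(image)):
        for c in range(len(image[0])):
            if image[r][c] and any(label_fires(card_at(image, r, c), k, b)
                                   for k, b in kernels):
                result[r][c] = 0   # delete the center pixel of this card
    return result
```

A toy kernel that marks the center and the pixel below it as excitatory and the pixel above it as inhibitory, with a bias of -1, removes the upper row of a two-pixel-thick horizontal line and leaves the lower row intact.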

Line thinning is an important step in 
machine vision, not only in character 
recognition, but also in tasks like finger¬ 
print analysis and inspection of manufac¬ 
tured parts. Many algorithms have been 
developed to handle this problem. 11 The 
algorithm implemented with the network 
circuit does not differ fundamentally from 
other pixel-based algorithms. The stored 
vectors facilitate making the same type of 
tests as those formulated in the other 
algorithms with a set of Boolean func¬ 
tions. Most other algorithms base their 
decision on a 3 x 3 area around the pixel 
under test, since the test of larger areas 
becomes very time consuming. However, 
with the network it is not a problem to ana¬ 
lyze the larger 5x5 pixel area, since it still 
takes only one clock cycle. Using a larger 
area makes the algorithm more robust, 
and it supports integrating some smooth¬ 
ing to enhance the thinning operation. 

To accomplish the extraction of geometrical
features, 40 vectors are stored in the
circuit. Each one looks for a feature like
a straight line, an endpoint of a line, a corner,
crossing lines, or an arc. Whenever
one of these features is present in the
image, the label of this feature-vector turns
on.

Figure 8. The result of a line-thinning operation on a handwritten “3.” The gray
area represents the original character, and the black area is the portion that remains
after three thinning scans.

Figure 9. One of the kernels stored in
the chip for the line-thinning operation.
If the label connected to this vector
turns on, the center pixel is erased in the
image. The black pixels are coded as
excitatory connections (+) and the gray
pixels as inhibitory connections (-).
The bias is set to -4. The label turns on
whenever five black pixels in the image
correspond to the black pixels in this
kernel and no black pixel in the image is
at a position of a gray pixel in the kernel.
Then, the center pixel is a part of
the boundary of a thick diagonal line or
a corner and can be deleted.

The image is scanned sequentially, as 
described for the line-thinning operation. 
For every pixel position, a 7x7 pixel 
neighborhood is searched in parallel for 
the 40 different features. Such a scan 
results in 40 feature maps that indicate 
where particular features occur. To com¬ 
press this large amount of data, the reso¬ 
lution of the feature maps is then reduced 
from 16 x 16 to 3 x 3 positions. These fea¬ 
ture maps are compared with reference 
characters and the best match is chosen. 
Currently, the minicomputer accom¬ 
plishes this last matching operation, but a 
way to map this final operation onto the 
network circuit is under development. 

The successful recognition rate for 
hand-written digits is approximately 90 
percent. We assume that we can improve 
this rate considerably by using finer resolution
for the character images as well as
the feature maps. We now use only local 
geometrical features of a maximum size of 
7x7 pixels. We can expect better results 
when large features such as long strokes or 
large circles are generated first from these 
local features. However, the main purpose 
of this experiment was to test the circuit’s 
performance and to gain experience pro¬ 
gramming different algorithms into the 
network. We have not yet pushed the anal¬ 
ysis to find the highest recognition rate. 

Character recognition based on feature 
extraction is more robust against distor¬ 
tions than simple template-matching 
algorithms that compare the pixel posi¬ 
tions of the whole character to reference 
data. Feature extraction is one of the most 
versatile functions of low-level machine 
vision and can be applied to many 
problems. 


Discussion 

We have tested the circuit with several 
programmed algorithms and conclude 
that a network like this one is reliable and 
robust enough for applications. We 
designed this circuit to study the behavior 
of a large analog network and did not 
specialize it for any particular application. 
Since input and output of data limit the 
processing, we must optimize the I/O 
structure in new designs for particular 
applications. 

The analog portion of the circuit 
represents a very powerful processor. With 
50 stored vectors, each 50 bits long, the 
chip completes a processing cycle in less 
than one microsecond. This means that the 
circuit evaluates 50 inner products between 
two 50-bit vectors per microsecond. On a 
standard microprocessor this computation 
would require more than one hundred 
instructions. 

A new chip being fabricated was 
designed specifically for image processing. 
The input vector is shifted in a shift regis¬ 
ter along the inputs of the amplifiers. In 
this way a new input vector is ready for a 
computing cycle at each clock cycle when 
a card of 8 x 12 pixels is scanned over an 
image. It is possible to store 46 vectors, 
each 96 bits long, and simulations indicate 
that a complete program cycle can be 
accomplished in approximately 100 
nanoseconds. 

This circuit will evaluate several hun¬ 
dred million inner products between two 
96-bit vectors per second. For example, the 
circuit can do the line-thinning operation 


described above at a rate of a few hundred 
microseconds per character compared to 
the few hundred milliseconds per charac¬ 
ter a standard computer takes. 10 (With 
character sizes of 32 x 32 pixels, each scan 
requires 1024 cycles if the whole area is 
scanned; about three scans are required.) 
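
The quoted rates follow from simple arithmetic, reproduced here as a small check. The constants are the figures given in the text (46 stored vectors, a cycle of roughly 100 nanoseconds, 32 x 32-pixel characters, three scans); the function names are illustrative.

```python
CYCLE_TIME_S = 100e-9   # simulated cycle time of the new chip
STORED_VECTORS = 46     # vectors compared in parallel during each cycle


def thinning_cycles(width=32, height=32, scans=3):
    """Processing cycles needed when every pixel position is visited on each scan."""
    return width * height * scans


def thinning_time_s(width=32, height=32, scans=3):
    """Line-thinning time per character, ignoring any I/O overhead."""
    return thinning_cycles(width, height, scans) * CYCLE_TIME_S


def inner_products_per_second():
    """One inner product per stored vector per cycle."""
    return STORED_VECTORS / CYCLE_TIME_S
```

Three scans of 1024 cycles each take about 307 microseconds, and 46 inner products every 100 nanoseconds correspond to roughly 460 million per second, consistent with "a few hundred microseconds" and "several hundred million."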

This new design is fabricated with the 
same conservative 2.5 micrometer CMOS 
process. Switching to smaller design rules 
will allow packing considerably more cir¬ 
cuitry on a single chip. Also, since this net¬ 
work is implemented with a standard 
digital fabrication process, it can be com¬ 
bined easily with other memory and 
processor modules on the same chip to 
enhance its versatility. 

A system concept with a layered struc¬ 
ture is also under development. In this 
scheme a layer of network processors is 
followed by a memory module, with 
several of these units stacked in series. In 
an application like character recognition, 
the different tasks such as line thinning or 
feature extraction are then done in differ¬ 
ent processor layers. Each layer of proces¬ 
sors inputs the data from the memory 
module below it and outputs its results in 
the memory module above it. 

The data flows mainly in one direction, 
from the raw data at the input of the lowest 
layer to the output layer that does the pat¬ 
tern identification. However, communica¬ 
tion is also provided in the opposite 
direction, so that results from a higher 
layer can determine the operation per¬ 
formed in a lower level. This feature facili¬ 
tates scanning certain areas in the image 
with a different resolution, or scanning the 
image for different features when the 
results in the higher levels are ambiguous. 
We need additional research on the map¬ 
ping of different algorithms into the net¬ 
work, and how to format the data 
optimally to feed from one network into 
the next with minimal intermediate refor¬ 
matting. 


The network described provides a
flexible tool because it can evalu¬ 
ate Boolean expressions and 
arithmetic equations. Methods of statisti¬ 
cal as well as structural pattern recognition 
can be mapped into the chip. Ideas from 
the artificial intelligence community on 
bit-mapped classifiers 12 suggest that 
expert systems could be made from this 
network. With all these elements at hand, 
this network looks promising for building 
powerful recognition systems.

Computing Motion Using Analog and Binary Resistive Networks

Christof Koch, Jin Luo, and Carver Mead

To us, and to other biological
organisms, vision seems effort¬ 
less. We open our eyes and we 
“see” the world in all its color, brightness, 
and movement. Flies, frogs, cats, and 
humans can all equally well perceive a 
rapidly changing environment and act on 
it. Yet, we have great difficulties when try¬ 
ing to endow our machines with similar 
abilities. In this article, we describe recent 
developments in the theory of early 
vision that led from the formulation of the 
motion problem as an ill-posed one to its 
solution by minimizing certain “cost” 
functions. These cost or energy functions 
can be mapped onto simple analog and 
digital resistive networks. Thus, we can 
compute the optical flow by injecting cur¬ 
rents into resistive networks and recording 
the resulting stationary voltage distribu¬ 
tion at each node. These networks, which 
we implemented in complementary metal 
oxide semiconductor (CMOS) very large 
scale integrated (VLSI) circuits, represent 
plausible candidates for biological vision 
systems. 

Motion 

The movement of objects relative to 
eyes or cameras serves as an important 


source of information for many tasks. We 
need motion to track objects and to deter¬ 
mine whether an object is approaching or 
receding. Relative motion contains infor¬ 
mation regarding the three-dimensional 
structure of objects and allows biological 
organisms to navigate quickly and effi¬ 
ciently through the environment. 

There exist two basic methods for computing
motion. Intensity-based schemes
rely on spatial and temporal gradients of
the image intensity to compute the speed
and the direction in which each point in the
image moves. The output is a velocity or
motion vector field covering the entire
image. The second method is based on the
identification of special features in the
image, called tokens, which are then
matched from image to image. This
method relies on the unambiguous identification
of the tokens (for instance,
corners) in each image frame before the
matching occurs and only yields a velocity
vector at the sparse token locations.
Psychophysical evidence suggests that
both systems coexist in humans. 1

*Hutchinson is now with Thinking Machines Corp.

The principal drawback of all intensity- 
based schemes lies in the data used— 
temporal variations in brightness 
patterns—which give rise to the perceived 
motion field, the optical flow. In general,
the optical flow and the underlying veloc¬ 
ity field, a purely geometrical concept, dif¬ 
fer. 2 For example, a featureless rotating 
sphere will not give rise to any optical flow, 
because the brightness does not appear to 
change even though the velocity field is 
non-zero. Conversely, if a shadow moves 
across the same featureless but now sta¬ 
tionary sphere, the optical flow is non-zero 
although the velocity field is zero. Apart 
from such situations, the estimated optical
flow will be very nearly identical to the
underlying velocity field, if strong enough 
gradients exist in the image. In this article, 
we assume that such strong gradients exist, 
as they do for most natural scenes, and 
consider how we can compute the velocity 
field using simple resistive networks. 

Aperture problem. Let us derive an 
equation relating the change in image 
brightness to the motion of the image. 2 
We denote the image at time t by I(x,y,t). 
Let us assume that the brightness of the
image is constant over time:

$$\frac{dI(x,y,t)}{dt} = 0 \qquad (1)$$

This will be true, for instance, if a rigid
object translates in space (assuming orthographic
projection), but not if it rotates.
On the basis of the chain rule of differentiation,
Equation 1 transforms into

$$\frac{\partial I}{\partial x}\frac{dx}{dt} + \frac{\partial I}{\partial y}\frac{dy}{dt} + \frac{\partial I}{\partial t} = I_x u + I_y v + I_t = \nabla I \cdot \mathbf{v} + I_t = 0 \qquad (2)$$

where we define the velocity $\mathbf{v}$ as $(u, v) =
(dx/dt, dy/dt)$, and where $I_x$, $I_y$, and $I_t$ are
the partial derivatives of the brightness $I$
with respect to $x$, $y$, and $t$. Because we
assume that we can compute these spatial 
and temporal image gradients, we now 
have a single linear equation in two 
unknowns, u and v, the two components 
of the velocity vector. 

In other words, this equation by itself is 
not sufficient to determine the velocity 
field. Figure 1 graphically illustrates this 
aperture problem. Any measuring system 
with a finite aperture, whether biological 
or artificial, can only sense the velocity 
component perpendicular to the edge or 
along the spatial gradient ($-I_t/|\nabla I|$). The
component of motion perpendicular to the 
gradient cannot, in principle, be regis¬ 
tered. The problem remains unchanged 
even if we measure these velocity compo¬ 
nents at many points throughout the 
image. For each measurement, we recover 
one equation with two unknowns. 
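
A small numerical sketch may help here. It computes the only component the brightness constraint determines, the one along the gradient, and shows that adding any motion perpendicular to the gradient leaves the constraint satisfied. The function names are illustrative and the gradient values are made up for the example.

```python
def normal_flow(ix, iy, it):
    """Velocity component along the spatial gradient implied by Ix*u + Iy*v + It = 0.

    Returns None where the gradient vanishes and nothing can be measured.
    """
    mag2 = ix * ix + iy * iy
    if mag2 == 0:
        return None
    scale = -it / mag2            # magnitude is -It / |grad I| along grad I / |grad I|
    return (scale * ix, scale * iy)


def constraint_residual(ix, iy, it, u, v):
    """Left-hand side of the brightness constraint; zero for any admissible (u, v)."""
    return ix * u + iy * v + it
```

For a vertical edge with gradient (1, 0) moving with true velocity (2, 1), the temporal derivative is -2 and the recovered normal flow is (2, 0); the candidates (2, 1), (2, -5), and (2, 100) all satisfy the constraint equally well.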

Smoothness assumption. Formally, this 
problem can be characterized as ill- 
posed. 3 Hadamard introduced this con¬ 
cept to describe problems in mathematical 
physics that (1) have no solution at all, (2) 
have no unique solution, or (3) do not 
depend continuously on the initial data. 
Inverse problems, such as computer
tomography, represent ill-posed problems.
All problems in early vision, which
we define as the set of processes that
recover the properties of the visible three-dimensional
surfaces from the two-dimensional
intensity arrays on retinae or
cameras, are ill-posed. For example,
binocular stereo and interpolating surfaces
from sparse and noisy data are ill-posed,
because in the former many and in the latter
infinitely many solutions exist.

Figure 1. The aperture problem of motion. Any system with finite aperture,
whether of biological or artificial origin, can only measure the velocity component
$-I_t/|\nabla I|$ along the spatial gradient $\nabla I$. Motion perpendicular to the gradient will
not be visible, except by tracking salient features in the image. 1

How can we make these problems well- 
posed, with unique solutions depending 
continuously on the data? One method of 
“regularizing” ill-posed problems 
involves restricting the class of admissible 
solutions by imposing appropriate con¬ 
straints. 3 Applying this method to 


motion, we argue that, in general, objects 
are smooth—except at isolated 
discontinuities—undergoing smooth 
movements. Thus, in general, neighboring 
points in the world will have similar veloc¬ 
ities. The projected velocity field should 
reflect this fact. We therefore impose on 
the velocity field the constraint that it 
should be the smoothest (in a given sense) 
while satisfying the data. As the measure 
of smoothness we choose the square of the 
velocity field gradient: 



Figure 2. Rectangular grid and resistive network. (a) Rectangular grid for solving
the discrete version of Equation 4. (b) Part of the resistive network minimizing the
discrete approximation of the energy function E in Equation 4. We assume the
conductance T connecting neighboring nodes to be constant. Each node connects to a
variable battery $E_{ij}$ via a conductance $g_{ij}$. Parasitic capacitances (on the order of 0.1
picofarad) give the circuit its dynamic behavior. The final network consists of two
such resistive networks superimposed, where corresponding nodes are connected
via a variable conductance $T_{c-ij}$, as in Figure 3b. Once the batteries and the
conductances $g^u_{ij}$ and $g^v_{ij}$ have been set, the network will converge, following Kirchhoff’s
laws, to the state of least power dissipation that corresponds to the solution of the
variational Equation 4.

A variational functional provides the most general way to formulate the problem. 2
The final velocity field (u, v) minimizes

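$$E(u,v) = \iint \left\{ (I_x u + I_y v + I_t)^2 + \lambda \left[ \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2 \right] \right\} dx\,dy \qquad (4)$$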

where the regularization parameter $\lambda$ is
inversely dependent on the signal-to-noise
ratio. The first term describes the fact that
the final solution should follow as closely
as possible the measured data, whereas the
second term imposes the smoothness constraint
on the solution. The degree of
minimization of one or the other term is
governed by $\lambda$. With accurate data, violating
the first term should be “expensive”
and $\lambda$ will be small. Conversely, with
unreliable data (low signal-to-noise ratio),
much more emphasis will be placed on the
smoothness term. Horn and Schunck 2
first formulated this variational approach
to the motion problem.

The energy E(u, v) is quadratic in the
unknowns u and v. It then follows from
the standard calculus of variations that the
associated Euler-Lagrange equations will
be linear in u and v:

$$\begin{aligned}
I_x^2 u + I_x I_y v - \lambda \nabla^2 u + I_x I_t &= 0 \\
I_x I_y u + I_y^2 v - \lambda \nabla^2 v + I_y I_t &= 0
\end{aligned} \qquad (5)$$

We now have two linear equations at every 
point. Our problem is therefore com¬ 
pletely determined. We could now use a 
number of iterative techniques, such as 
steepest descent, to solve these equations. 
Instead, we pursue a different path. 

Analog resistive networks. Let us 
assume that we are formulating Equations 
4 and 5 on a discrete two-dimensional grid, 
such as the one shown in Figure 2a. Equa¬ 
tion 5 then transforms into 

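$$\begin{aligned}
I_{xij}^2 u_{ij} + I_{xij} I_{yij} v_{ij} - \lambda \left( u_{i+1,j} + u_{i,j+1} - 4u_{ij} + u_{i-1,j} + u_{i,j-1} \right) + I_{xij} I_{tij} &= 0 \\
I_{xij} I_{yij} u_{ij} + I_{yij}^2 v_{ij} - \lambda \left( v_{i+1,j} + v_{i,j+1} - 4v_{ij} + v_{i-1,j} + v_{i,j-1} \right) + I_{yij} I_{tij} &= 0
\end{aligned} \qquad (6)$$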

where we replaced the Laplacian with its 
five-point approximation on a rectangular 
grid. We now show that this set of linear 
equations can be solved naturally using a 
simple resistive network. Let us apply 
Kirchhoff’s current law to the center node 
of the resistive network shown in Figure 
2b. We then have the following update 
equation: 

$$C \frac{du_{ij}}{dt} = T \left( u_{i+1,j} + u_{i,j+1} - 4u_{ij} + u_{i-1,j} + u_{i,j-1} \right) + g_{ij} \left( E_{ij} - u_{ij} \right) \qquad (7)$$

Let us now assume that we have two such
resistive networks superimposed, with the
node ij in the upper network connected,
via a conductance $T_{c-ij}$, to
the corresponding node ij in the bottom network
(see Figure 3b). We then have two
equations similar to Equation 7 with a
coupling term $T_{c-ij}(v_{ij} - u_{ij})$, where $v_{ij}$ is
the voltage at node ij in the bottom network.
If we assume that the resistive network
has converged to its final state,
$du_{ij}/dt = 0$ and $dv_{ij}/dt = 0$, we see that
both equations are identical with Equation
6, if we identify

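$$\begin{aligned}
T &\rightarrow \lambda \\
T_{c-ij} &\rightarrow -I_{xij} I_{yij} \\
g^u_{ij} &\rightarrow I_{xij} \left( I_{xij} + I_{yij} \right) \\
g^v_{ij} &\rightarrow I_{yij} \left( I_{xij} + I_{yij} \right) \\
E_{ij} &\rightarrow \frac{-I_{tij}}{I_{xij} + I_{yij}}
\end{aligned} \qquad (8)$$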

Once we set the batteries and the conductances
to the values indicated in Equation 8,
the network will settle, following
Kirchhoff’s laws, into the state of least
power dissipation. The associated stationary
voltages correspond to the solution
sought: $u_{ij}$ is equivalent to the x component
and $v_{ij}$ to the y component of the
optical flow field. A unique and stable
solution always exists, even if some of the
conductances have negative values. In
fact, many of the conductances connecting
the lower and the upper networks
($T_{c-ij}$) and the conductances associated
with the batteries ($g^u_{ij}$ and $g^v_{ij}$) will be negative,
because the sign of $I_x$ and $I_y$ can be
either negative or positive. As we will see,
this poses no serious problems, given the
technology we have chosen to build
resistances. (See the sidebar “Parallel
computer implementation.”)
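
The settling behavior described above can be tried out numerically. The following is a minimal sketch, not the authors' simulation code: it Euler-integrates Equation 7 for both grids with the capacitance folded into the step size, and it uses the fact that, under the identifications of Equation 8, the battery and coupling currents at a node combine to $-I_x(I_x u + I_y v + I_t)$ for the top grid and $-I_y(I_x u + I_y v + I_t)$ for the bottom grid. The function names, the step size `dt`, the step count, and the synthetic test gradients are all assumptions made for illustration.

```python
import math

def relax_flow_network(Ix, Iy, It, lam=1.0, dt=0.1, steps=2000):
    """Euler-integrate the two coupled resistive grids until they settle.

    Ix, Iy, It are lists of rows holding the brightness derivatives.
    Returns (u, v), the node voltages of the top and bottom grids.
    """
    rows, cols = len(Ix), len(Ix[0])
    u = [[0.0] * cols for _ in range(rows)]
    v = [[0.0] * cols for _ in range(rows)]

    def laplacian(f, i, j):
        # Five-point Laplacian; off-grid neighbors copy the edge value,
        # which gives a zero normal derivative at the boundary.
        up = f[max(i - 1, 0)][j]
        down = f[min(i + 1, rows - 1)][j]
        left = f[i][max(j - 1, 0)]
        right = f[i][min(j + 1, cols - 1)]
        return up + down + left + right - 4.0 * f[i][j]

    for _ in range(steps):
        new_u = [row[:] for row in u]
        new_v = [row[:] for row in v]
        for i in range(rows):
            for j in range(cols):
                # Battery plus coupling currents, combined via Equation 8.
                residual = Ix[i][j] * u[i][j] + Iy[i][j] * v[i][j] + It[i][j]
                new_u[i][j] += dt * (lam * laplacian(u, i, j) - Ix[i][j] * residual)
                new_v[i][j] += dt * (lam * laplacian(v, i, j) - Iy[i][j] * residual)
        u, v = new_u, new_v
    return u, v

def demo():
    # Hypothetical test data: unit gradients whose direction varies from
    # node to node, with It chosen to match a uniform flow of (1.0, -0.5).
    n, true_u, true_v = 8, 1.0, -0.5
    Ix = [[math.cos(1.0 * i + 2.0 * j) for j in range(n)] for i in range(n)]
    Iy = [[math.sin(1.0 * i + 2.0 * j) for j in range(n)] for i in range(n)]
    It = [[-(Ix[i][j] * true_u + Iy[i][j] * true_v) for j in range(n)]
          for i in range(n)]
    return relax_flow_network(Ix, Iy, It)
```

Writing the update in the combined form avoids dividing by $I_x + I_y$, which can vanish; the stationary state is the same as that of Equation 6. In `demo()` every node ends up close to the uniform flow used to generate the data.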
Figure 3. Rectangular grid with line processes and hybrid network. (a) The location of the horizontal ($l^h_{ij}$) and vertical ($l^v_{ij}$) line
processes relative to the rectangular motion-field grid. (b) The hybrid resistive network, computing the optical flow in the
presence of discontinuities. The conductances $T_{c-ij}$, which connect both grids, depend on the brightness gradient, as do the
conductances $g^u_{ij}$ and $g^v_{ij}$, which connect each node with the battery. For clarity, only two such elements are shown. The battery $E_{ij}$
depends on both the temporal and the spatial gradient and is zero if no brightness change occurs. The x component of the
velocity, u, is given by the voltage in the top network, while the y component of the velocity, v, is given by the voltage in the
bottom network. A high voltage value at location i,j will spread to its four neighboring nodes. The degree to which voltage
spreads depends on the value of the fixed conductance, T, given by the inverse of the signal-to-noise ratio. Binary switches,
which make or break the resistive connections between nodes, implement motion discontinuities, because an arbitrarily high
voltage (velocity) will not affect the neighboring site across the discontinuity. An extended horizontal motion discontinuity is
indicated. These switches could be under the control of distributed digital processors. Analog CMOS implementations of the
line processes also are feasible. 4


The sequences in Figures 4, 5, and 6 
illustrate the resulting optical flow for syn¬ 
thetic and natural images. Figure 4c illus¬ 
trates the initial velocity data and the 
velocity component perpendicular to the 
image gradient. Figure 4d shows the result¬ 
ing smooth optical flow. As discussed by 
Horn and Schunck, 2 the smoothness con¬ 
straint leads to a qualitatively correct esti¬ 
mate of the velocity field. Thus, one 
undifferentiated blob appears to move to 
the lower right and one blob to the upper 
left. However, at the occluding edge where 
both squares overlap, the smoothness 
assumption results in a spatial average of 
the two opposing velocities, and the esti¬ 
mated velocity is very small or zero. 


Parallel computer implementation 


We simulated the behavior of 
these networks for both synthetic 
and natural images by solving the 
previous circuit equations at each 
node. As boundary conditions, we 
copied the initial velocity data at the 
edge of the image into the nodes 
lying directly adjacent to but outside 
the image (zero normal derivative). 
We estimated the spatial and temporal
derivatives $I_x$, $I_y$, and $I_t$ using a
discrete eight-point approximation.
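
As a rough illustration of such an estimate, the sketch below assumes that the eight-point approximation is the familiar scheme that averages first differences over the eight corners of a 2 x 2 x 2 cube spanning two adjacent rows, two adjacent columns, and two successive frames. That choice, the function name `brightness_derivatives`, and the list-of-rows image layout are assumptions, not details given in the text.

```python
def brightness_derivatives(frame0, frame1):
    """Estimate Ix, Iy, It on the (rows-1) x (cols-1) grid of 2 x 2 pixel cells.

    frame0 and frame1 are equally sized lists of rows of brightness values
    taken one time step apart; x runs along columns and y along rows.
    """
    rows, cols = len(frame0), len(frame0[0])
    Ix, Iy, It = [], [], []
    for i in range(rows - 1):
        row_x, row_y, row_t = [], [], []
        for j in range(cols - 1):
            # The eight brightness samples at the corners of the cube.
            a0, b0 = frame0[i][j], frame0[i][j + 1]
            c0, d0 = frame0[i + 1][j], frame0[i + 1][j + 1]
            a1, b1 = frame1[i][j], frame1[i][j + 1]
            c1, d1 = frame1[i + 1][j], frame1[i + 1][j + 1]
            # Each derivative is the mean of four first differences.
            row_x.append(0.25 * ((b0 - a0) + (d0 - c0) + (b1 - a1) + (d1 - c1)))
            row_y.append(0.25 * ((c0 - a0) + (d0 - b0) + (c1 - a1) + (d1 - b1)))
            row_t.append(0.25 * ((a1 - a0) + (b1 - b0) + (c1 - c0) + (d1 - d0)))
        Ix.append(row_x)
        Iy.append(row_y)
        It.append(row_t)
    return Ix, Iy, It
```

For a brightness ramp that rises by 2 per column and 3 per row and brightens by 1 between frames, the three estimates are exactly 2, 3, and 1 at every cell.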

Given the high computational cost 
associated with solving these ellipti¬ 
cal equations, we used parallel com¬ 


puters of the Hypercube family: the 
32-node Mark III Hypercube at the Jet 
Propulsion Laboratory and a 4- and a 
16-node Ncube in the laboratory at 
Caltech. Even though we used a variable
time-step algorithm, convergence
was slow (10 minutes
for a 128 x 128 image). Solving Equation 4
is similar to solving Poisson’s
equation. Thus, the number of iterations
required to converge is proportional
to $n^2$ (on an n x n-pixel image).
A multigrid approach will greatly 
speed up the performance. 
Figure 4. Motion sequence using synthetic data. (a) and (b) Two 32 x 32-pixel images of three high-contrast squares on a
homogeneous white background. Only the two squares on the upper left are displaced. (c) The initial velocity data. The insides
of both squares contain no data. (d) The final state of the network after 240 iterations, corresponding to the smooth optical
flow field. The algorithm gives a qualitatively correct estimate of the velocity field. Note, however, the vanishing velocity estimate
at the occluding edges where the two moving squares overlap, caused by the averaging property of the smoothness constraint.
Moreover, the moving objects are not delineated in the flow field, because the algorithm smooths over the figure-ground
motion discontinuity. (e) Optical flow in the presence of motion discontinuities (indicated by solid lines). Numerous line
processes are turned on in the area where the moving objects overlap. The formation of discontinuities along continuous
contours is explicitly encouraged. (f) Discontinuities are strongly encouraged to form at the location of intensity edges. 5 This
additional constraint leads to the correct velocity field. The location of these discontinuities facilitates object segmentation at a
later stage of visual analysis. Both (e) and (f) show the state of the hybrid network after six analog-digital cycles.


In parts of the image where the bright¬ 
ness gradient is zero and thus no initial 
velocity data exist (for instance, in the 
interiors of the two squares), the velocity 
estimate is simply the spatial average of the 
neighboring velocity estimates. These 
empty areas eventually will fill in from the 
boundary, similar to the flow of heat for 
a uniform flat plate with “hot” 
boundaries. 

The sequence in Figure 5 also illustrates 


the effect of varying the conductance T
between neighboring points. As we place
more confidence in the measured data
(small $\lambda$), the coupling between neighboring
nodes decreases because T decreases,
and the optical flow becomes more
inhomogeneous, better reflecting the correct
velocity field. As the data become less
reliable (large $\lambda$), more smoothing occurs
until little spatial variation exists (see Figure 3).


Motion discontinuities. The smoothness
assumption of Horn and Schunck 2
regularizes the aperture problem and leads
to the qualitatively correct velocity field
inside moving objects. However, this
approach fails to detect the locations at
which the velocity changes abruptly or
discontinuously. Thus, this strategy smooths
over the figure-ground discontinuity or
completely fails to detect the boundary
between two objects with differing velocities,
because the algorithm combines
velocity information across motion
boundaries. We argue that motion discontinuities
are the most interesting locations
in any image, because they indicate where
one object ends and another one begins.
Motion as well as intensity discontinuities
are vital for solving the critical object
segmentation problem that occurs at a subsequent
stage of the image understanding
process.

Figure 5. Optical flow of a moving hand. (a) and (b) Two 128 x 128-pixel images
captured by a video camera. The hand is displaced downward by up to two pixels.
(c) Zero-crossings of the Laplacian of a Gaussian (with seven-pixel-wide center
lines) superimposed on the initial velocity data. The zero-crossings are thresholded
to remove noise. In areas with little or no spatial gradients, amplified image noise
leads to noisy velocity data, since their amplitude is given by $-I_t/|\nabla I|$. The
zero-crossings of both images are shown, thus the double line. Notice the stationary
zero-crossing at the right edge. (d) The smooth optical flow after 1000 iterations. In
these and the following images, the plotted individual velocity vectors are not
highly visible; however, the gray-scale intensity is proportional to the magnitude of
the velocity (the direction is always downward). The smooth optical flow for a
five-times-lower and five-times-higher value of the conductance T appears in (e) and (f).
The next four images show the state of the hybrid network after the first (g),
second (h), fifth (i), and ninth (j) analog-digital cycles. In the final image, the fingers
have a higher velocity than does the hand itself. It takes several cycles for the
motion discontinuities to “creep” around the outline of the hand.

Figure 6. Optical flow of a moving person. (a) and (b) Two 128 x 128-pixel images captured by a video camera. The person in
the foreground is moving toward the right while the person in the background is stationary. The noise in the lower part of the
image is a camera artifact. (c) Thresholded zero-crossings superimposed on the initial velocity data. (d) The smooth optical
flow after 1000 iterations. Note that the noise in the lower part of both video images is completely smoothed away. (e) The
final piecewise smooth optical flow after 13 analog-digital cycles. The velocity field is subsampled to improve visibility. With
the exception of a square appendage at the right hip, the optical flow field shown corresponds to the correct velocity field. The
appendage, caused by the edge-detection scheme lumping part of the garbage can in the background with the contour of the
person, represents an instance of what psychophysicists term motion capture. More recently, we have successfully computed
the optical flow field for images with many, partially occluding, moving people.

Various researchers have attempted to 
prevent the smoothing constraint from 
taking effect across strong velocity gra¬ 
dients. 6 Geman and Geman 7 proposed a 
successful strategy for dealing with discon¬ 


tinuities. They exploited an analogy 
between statistical mechanics and images, 
whereby the intensity values at each pic¬ 
ture element and the presence of discon¬ 
tinuities are viewed as states of particles on 
a lattice. We can assign an “energy” func¬ 
tion to this system and compute its most 
likely state. 

In this article, we do not rigorously 
develop this approach, based on Bayesian 
estimation theory. 7,8 Suffice it to say that 
a priori knowledge (for instance, that the 
velocity field in general should be smooth) 
can be formulated in terms of a Markov 
random field model of the image. (In a 
Markov random field, the conditional 


probability that a given variable at location
i,j has a particular value $f_{ij}$ depends only
on the values of f in a neighborhood of ij.)
Given such an image model, and given 
noisy data, we then estimate the “best” 
flow field by some likelihood criterion. 
The one we use here is the maximum a 
posteriori estimate, although other possi¬ 
ble criteria have certain advantages. 8 
Maximizing the a posteriori probability 
yields the best solution. We can show this 
to be fully equivalent to minimizing an 
expression such as Equation 4. 
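
The equivalence can be made explicit with a short worked step. The sketch below assumes that the posterior probability of a flow field given the data d takes the Gibbs form; the normalizing constant Z and the symbol d are introduced here only for illustration.

$$P(u,v \mid d) = \frac{1}{Z}\, e^{-E(u,v)} \quad\Longrightarrow\quad \arg\max_{u,v} P(u,v \mid d) = \arg\min_{u,v} E(u,v)$$

Taking the negative logarithm of Bayes' rule, $-\log P(u,v \mid d) = -\log P(d \mid u,v) - \log P(u,v) + \text{const}$, the first term on the right plays the role of the data term in Equation 4 and the second plays the role of the smoothness term weighted by $\lambda$.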

To reconstruct images consisting of
piecewise constant segments, Geman and
Geman 7 further introduced the powerful
idea of a line process l (see also Blake and
Zisserman 9 ). For our purposes, we
assume that a line process can occupy one
of two states: “on” (l = 1) or “off” (l = 0).
Line discontinuities are located on a regular
lattice set between the original pixel lattice
(see Figure 3a), such that each pixel ij
has one horizontal $l^h_{ij}$ and one vertical $l^v_{ij}$
line process associated with it. If the
appropriate line process is turned on, the
smoothness term between the two adjacent
pixels will be set to zero.

To prevent line processes from forming 
everywhere and to incorporate additional 
knowledge regarding discontinuities into 
the line processes, we must include an 
additional term V c (l) in the new energy 
function: 

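$$\begin{aligned}
E(u, v, l^h, l^v) ={}& \sum_{i,j} \left( I_x u_{ij} + I_y v_{ij} + I_t \right)^2 \\
&+ \lambda \sum_{i,j} \left( 1 - l^h_{ij} \right) \left[ \left( u_{i+1,j} - u_{ij} \right)^2 + \left( v_{i+1,j} - v_{ij} \right)^2 \right] \\
&+ \lambda \sum_{i,j} \left( 1 - l^v_{ij} \right) \left[ \left( u_{i,j+1} - u_{ij} \right)^2 + \left( v_{i,j+1} - v_{ij} \right)^2 \right] + V_c(l)
\end{aligned} \qquad (9)$$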

V c contains a number of terms penaliz¬ 
ing or encouraging specific configurations 
of line processes: 

$$V_c(l) = C_c \sum_{i,j} l^h_{ij} + C_P \sum_{i,j} l^h_{ij} \left( l^h_{i,j+1} + l^h_{i,j+2} \right) + C_l V_l(l)$$

plus the corresponding expression for the
vertical line process $l^v_{ij}$ (obtained by
interchanging i with j and $l^h_{ij}$ with $l^v_{ij}$). The first
term ($C_c$) penalizes each introduction of a
line process, because the cost $C_c$ has to be
“paid” every time a line process is turned
on. The second term prevents the formation
of parallel lines. If either $l^h_{i,j+1}$ or $l^h_{i,j+2}$
is turned on, this term will tend to prevent
$l^h_{ij}$ from turning on. The third term ($C_l$)
embodies the fact that, in general, motion
discontinuities occur along extended contours
and rarely intersect. We adopt the
function given by Koch et al. 10 favoring
the formation of motion discontinuities
along extended contours and penalizing
both multiple line intersections and isolated
discontinuities.

We obtain the optical flow by minimizing
the cost function in Equation 9 with
respect to both the velocity field $\mathbf{v} = (u, v)$
and the line processes $l^h$ and $l^v$. However,
unlike before, this cost or energy function
is nonconvex, since it contains cubic and
possibly higher terms (in $V_c$). Geman and
Geman resorted to annealing, a statistical
optimization technique, to find the ground
state of their system. If annealing is
applied appropriately, the system converges,
with probability approaching one,
to the global maximum. 7 However, the
long convergence times required
make any practical application expensive.

To find an optimal solution to this non¬ 
quadratic minimization problem, we fol¬ 
low the approach used by Koch et al. 10 
and use a purely deterministic algorithm, 
based on solving Kirchhoff’s equations for 
a mixed analog and digital network. 8,11 
Our algorithm exploits the fact that for a 
fixed distribution of line processes, the 
energy function of Equation 9 is quad¬ 
ratic. Thus, we first initialize the analog 
resistive network (see Figure 3b) according 
to Equation 8 and with no line processes 
on. The network then converges to the 
smoothest solution. Subsequently, we 
update the line processes by deciding at 


each site of the line process lattice whether
the overall energy can be lowered by setting
or breaking the line process. We
always accept the state of the line process
corresponding to the lower energy configuration:
$l^h_{ij}$ will be turned on if
$E(u, v, l^h_{ij} = 1, l^v) < E(u, v, l^h_{ij} = 0, l^v)$; otherwise, $l^h_{ij} = 0$.
This computation requires only local
information. Breaking the appropriate 
resistive connection between the two 
neighboring nodes switches on the line 
processes. After the completion of one 
such analog-digital cycle, we reiterate and 
compute the smoothest state of the analog 
network for the newly updated distribu¬ 
tion of line processes. 
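
The analog-digital cycle can be sketched on a one-dimensional toy problem. This is a simplified illustration under stated assumptions, not the network of Figure 3b: it smooths scalar data instead of a velocity field, the only line-process cost is a constant `cost` per line (standing in for $C_c$, with the parallel-line and contour terms of $V_c$ left out), and the names `relax`, `update_lines`, and `hybrid_cycles` as well as all parameter values are hypothetical.

```python
def relax(data, lines, lam, dt=0.05, steps=500):
    """Analog phase: settle the node voltages for fixed line processes."""
    n = len(data)
    f = list(data)
    for _ in range(steps):
        new_f = f[:]
        for i in range(n):
            current = data[i] - f[i]  # battery term pulling toward the data
            if i > 0 and not lines[i - 1]:
                current += lam * (f[i - 1] - f[i])  # intact link to the left
            if i < n - 1 and not lines[i]:
                current += lam * (f[i + 1] - f[i])  # intact link to the right
            new_f[i] += dt * current
        f = new_f
    return f

def update_lines(f, lam, cost):
    """Digital phase: break the link between i and i+1 if that lowers E.

    Turning line i on removes lam * (f[i+1] - f[i])**2 from the energy
    and adds the fixed cost, so the comparison is purely local.
    """
    return [1 if lam * (f[i + 1] - f[i]) ** 2 > cost else 0
            for i in range(len(f) - 1)]

def hybrid_cycles(data, lam=4.0, cost=0.15, cycles=5):
    """Alternate analog relaxation and line-process updates."""
    lines = [0] * (len(data) - 1)  # start with no line processes on
    f = relax(data, lines, lam)    # smoothest solution
    for _ in range(cycles):
        lines = update_lines(f, lam, cost)
        f = relax(data, lines, lam)
    return f, lines
```

For a step input such as eight zeros followed by eight ones, the first relaxation blurs the step, the first update switches on only the line at the step, and subsequent cycles leave that configuration unchanged, so the output is piecewise constant.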

Although we have no guarantee that the 
system will converge to the global mini¬ 
mum, given our use of a gradient descent 


Restricting motion discontinuities to edges 


As edges we use the zero-crossings 
of a Laplacian of a Gaussian con¬ 
volved with the original image. Marr 
and Hildreth 12 have shown that these 
locations usually correspond to 
physical edges. We threshold these 
zero-crossings (based on the square 
of the gradient) in order to remove 
spurious zero-crossings caused by 
noise and “weak” edges. Other edge 
detection algorithms should work 
equally well. 

We now add a new term $V_{Z-C_{ij}}$ to
our energy function E, such that
$V_{Z-C_{ij}}$ is zero if $l_{ij}$ is off or if $l_{ij}$ is on
and a zero-crossing exists between
locations i and j. If $l_{ij} = 1$ in the
absence of a zero-crossing, $V_{Z-C_{ij}}$ is
set to a large positive number (in our
case, 1000).

This strategy effectively prevents 
motion discontinuities from forming 
at locations where no zero-crossings 
exist, unless strongly suggested by 
the data. Conversely, however, zero- 
crossings by themselves will not 
induce the formation of discontinui¬ 
ties in the absence of motion gra¬ 
dients. 
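
The effect of this term on a single line-process site can be sketched as follows. The code is illustrative only: the function names and the `smoothness_gain` argument (the smoothness energy that is removed when the line is switched on) are hypothetical, and every other contribution to $V_c$ is lumped into a single `line_cost`.

```python
LARGE_PENALTY = 1000.0  # the value quoted in the text

def zero_crossing_penalty(line_on, has_zero_crossing):
    """Energy added for one site by the zero-crossing term."""
    if line_on and not has_zero_crossing:
        return LARGE_PENALTY
    return 0.0

def should_turn_on(smoothness_gain, line_cost, has_zero_crossing):
    """Turn the line on only if doing so lowers the total energy."""
    energy_on = line_cost + zero_crossing_penalty(True, has_zero_crossing)
    energy_off = smoothness_gain  # smoothness energy paid while the line is off
    return energy_on < energy_off
```

With a zero-crossing present, a moderate velocity difference is enough to switch the line on. Without one, the line forms only when the smoothness energy exceeds the penalty, which matches the "unless strongly suggested by the data" behavior.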


Varying the ‘amplitude’ of motion 
discontinuities 


When dealing with real data, the 
amplitude of velocity and, consequent¬ 
ly, the amplitude of any motion dis¬ 
continuity vary over a considerable 
range (as compared to the artificial 
situation in Figure 4). Our strategy in 
dealing with this problem involves 
varying the magnitude of the $V_c$ term
in Equation 9 by multiplying $V_c$ by
1/K(t). 10 Initially, K(t) is small, but it
then increases linearly up to a given
upper bound.

In other words, the formation of 
discontinuities is penalized initially, 


encouraging a smooth interpolation 
everywhere except at very steep 
velocity gradients. Subsequently, by 
paying a smaller price for the forma¬ 
tion of line processes, the optical 
flow will break at smaller velocity 
gradients. The final state of the net¬ 
work is independent of the speed at 
which K(t) changes (adiabatic con¬ 
vergence). All other parameters 
remain constant and are identical for 
all simulations reported in this 
article. 
Figure 7. Basic resistance element in analog subthreshold CMOS technology. (a)
Shows a one-dimensional cut through the resistive network of Figures 2 and 8. The
active circuits (built out of nine transistors) within the shaded areas implement a
variable nonlinear resistance. Each transconductance amplifier implements the
conductance G (see Figure 8), whose value can be set by $V_{conf}$. The spatial response of
such a network to a point voltage stimulus applied to the left-hand side is shown in
(c). In an ideal tap line, the measured voltage values (points) should follow an
exponential decay (lines). $V_T - V_{conf}$ sets the decay length L. The current-voltage
characteristic of one such resistive element is illustrated in (b). The voltage $V_T$
controls the maximum current and thus the slope of the resistance, which can vary
between 100 kΩ and 10 GΩ. Many variations are possible. 4


rule, the system seems to find next-to- 
optimal solutions (see Figures 4,5, and 6) 
in about 10 to 15 analog-digital cycles. 
Furthermore, the algorithm must converge,
because at each step the energy E is
always reduced and E is bounded from
below. We compared statistical annealing
with our deterministic method in the case 
of interpolating and smoothing sparsely 


sampled data in the presence of discon¬ 
tinuity, where the underlying energy func¬ 
tion is similar to E in Equation 9. Both 
methods converged to qualitatively simi¬ 
lar solutions. 10 

The synthetic motion sequence in Figure 
4 demonstrates the dramatic effect of the 
line processes. The optical flow outside the 
discontinuities approximately delineating 


the boundaries of the moving squares is 
zero, as it should be (see Figure 4e). Where 
the two squares overlap, however, the 
velocity gradient is high and multiple inter¬ 
secting discontinuities exist. 

To restrict further the location of dis¬ 
continuities, we adopt a technique used by 
Gamble and Poggio 5 to locate depth dis¬ 
continuities by requiring that depth dis¬ 
continuities coincide with the location of 
intensity edges. In general, the physical 
processes and the geometry of the three- 
dimensional scene giving rise to the motion 
discontinuity will also give rise to an inten¬ 
sity edge. For example, moving physical 
objects occluding other objects will give 
rise to an image with edges at the occlud¬ 
ing boundaries. In fact, only under labora¬ 
tory conditions—for instance, using 
random dot patterns—does a motion dis¬ 
continuity not coincide with intensity 
edges. 

Figure 4f demonstrates that this strategy 
leads to the correct velocity field—with the 
exception of the corners—in addition to 
labeling all motion discontinuities. Figures 
5 and 6 demonstrate our method on image 
pairs obtained with a video camera. See 
also the sidebars “Restricting motion dis¬ 
continuities to edges” and “Varying the 
‘amplitude’ of motion discontinuities.” 

Analog VLSI networks 

Even with the approximations and 
optimizations we previously described, the 
computations involved in this and similar 
early vision tasks require tens of minutes 
to hours on a large-scale parallel computer 
(see, however, Gamble and Poggio 5 ). For 
the computations to be truly useful, we 
should be able to carry them out on a 
whole image in real time. Fortunately, 
modern integrated circuit technology gives 
us a medium in which we can realize 
extremely complex, analog real-time 
implementations of these computational 
metaphors. 4 

We can achieve a compact implementa¬ 
tion of a resistive network using an ordi¬ 
nary CMOS process, provided the 
transistors run in the subthreshold range, 
where their characteristics are ideal for
implementing low-current analog func¬ 
tions. We achieve the effect of a resistor by 
choosing the circuit configuration shown 
in Figure 7 rather than by using the resis¬ 
tance of a special layer in the process. We 
can control the value of the resulting resis¬ 
tance over five orders of magnitude by 
setting the bias voltages on the upper and 
Figure 8. Circuit design for a resistive network for interpolating and smoothing
noisy and sparsely sampled depth measurements. 10 The basic version of this CMOS
circuit contains 20 x 20 grid points on a hexagonal lattice. The individual resistive
elements with a variable slope controlled by $V_T$, shown in Figure 7, correspond to
the term governing the smoothness, $\lambda$. At those locations where a depth
measurement $d_{ij}$ is present, the battery is set to this value ($V_{in} = d_{ij}$) and the value of the
conductance G is set to $1/(2\sigma^2)$, where $\sigma^2$ is the variance of the Gaussian noise
process associated with the depth measurements. If no depth data are present at that
node, G is set to zero. The voltage at each node corresponds to the discrete values
of the smoothed surface fitted through the noisy and sparse measurements. 8,10 The
network for computing smooth optical flow minimizing E in Equation 4 via the
network shown in Figure 3b is similar to this circuit.


lower current source transistors. The 
current-voltage curve saturates above 
approximately 100 millivolts, a feature 
that we can use to advantage in many 
applications. 

With small voltage gradients, we can 
treat the circuit as if it were a linear resis¬ 
tor, as shown by the shaded areas on the 
curves (Figure 7b). Conductances to signal 
input sources are implemented with trans¬ 
conductance amplifier followers, as 
shown in Figure 7a. Each amplifier injects 
a current into the network proportional to 
the difference between the local signal 
potential and the potential of the network. 
The effect of a conductance is thus 
achieved without drawing any current 
from the signal source. The value of the 
conductance is set by the transconductance 
control on the amplifier, which we can use 
to reflect the confidence assigned to the 
particular input. High conductance values 
give the network a short spatial-averaging 
scale, low values give a long averaging 
scale. 

Figure 7c shows the spatial response of 
an experimental one-dimensional network 
to a point stimulus. We obtained the 
different values of averaging length L by 
appropriate settings of the amplifier trans¬ 
conductances. We can easily realize 
resistances with an effective negative resis¬ 
tance value. 
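
The dependence of the averaging length on the conductances can be illustrated with an idealized, purely linear one-dimensional ladder. The sketch below is not a model of the measured chip: it assumes ideal linear resistors, grounded inputs everywhere except the driven node, and hypothetical names (`ladder_response`, `decay_ratio`, `lateral`, `leak`). It solves Kirchhoff's current law at every node with the tridiagonal (Thomas) algorithm and compares the result with the geometric decay expected for an infinite ladder.

```python
import math

def ladder_response(n_nodes, lateral, leak, v_in=1.0):
    """Node voltages of a 1D resistive ladder driven at node 0.

    lateral: conductance between neighboring nodes.
    leak: conductance from each node to its (grounded) input.
    """
    m = n_nodes - 1                    # unknown nodes 1 .. n_nodes-1
    diag = [2.0 * lateral + leak] * m
    diag[-1] = lateral + leak          # the last node has a single neighbor
    rhs = [0.0] * m
    rhs[0] = lateral * v_in            # current injected from the driven node
    # Forward elimination; both off-diagonals equal -lateral.
    for k in range(1, m):
        w = -lateral / diag[k - 1]
        diag[k] -= w * (-lateral)
        rhs[k] -= w * rhs[k - 1]
    # Back substitution.
    v = [0.0] * m
    v[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):
        v[k] = (rhs[k] + lateral * v[k + 1]) / diag[k]
    return [v_in] + v

def decay_ratio(lateral, leak):
    """Ratio of successive node voltages on an infinite ladder.

    It is the root below 1 of lateral*r**2 - (2*lateral + leak)*r + lateral = 0.
    """
    b = 2.0 * lateral + leak
    return (b - math.sqrt(leak * leak + 4.0 * lateral * leak)) / (2.0 * lateral)
```

The decay length in node spacings is $-1/\ln r$, where r is the value returned by `decay_ratio`; raising the conductance to the inputs relative to the lateral conductance lowers r and therefore shortens the averaging scale.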

Figure 8 shows the ideal configuration 
for a network implementation in two 
dimensions. Each point on the hexagonal 
grid is coupled to six equivalent neighbors. 
The high degree of symmetry of such an 
arrangement creates a nearly isotropic 
environment, free of many of the “pre¬ 
ferred axis artifacts” introduced by an 
orthogonal grid. In addition, the larger 
connectivity allows a greater variation in 
effective resistor value caused by varia¬ 
tions in transistor parameters. 

Figure 9 shows a test chip implementing 
this network. Each node includes the resis¬ 
tor apparatus and a set of sample-and-hold 
circuits for setting the confidence and sig¬ 
nal input voltages. In addition, an output 
amplifier enables measurement of the 
node voltage without disturbing the node 
itself. A scanning mechanism addresses 
both the sample-and-hold circuits and the 
output buffer, so the stored variables can 
be refreshed or updated, and the map of 
node voltages can be read out in real time. 

A 48 x 48 silicon retina has been con¬ 
structed that uses the hexagonal network 
of Figure 8 as a model for the horizontal 
cell layer in the vertebrate retina. 13 In this 
application, the input potentials were the 


outputs of logarithmic photoreceptors— 
implemented via phototransistors—and 
the potential difference across the conduc¬ 
tance T formed an excellent approxima¬ 
tion to the Laplacian operator. 12 This 
model results in the classical center- 
surround receptive field properties 
observed in the response of retinal gan¬ 
glion cells. The circuit performs in real 
time. 

We have demonstrated that the
introduction of binary motion 
discontinuities into Horn and 
Schunck’s 2 algorithm leads to a much 


improved performance of their method, 
particularly for the optical flow in the pres¬ 
ence of a number of moving objects. 
Moreover, we have shown that the appro¬ 
priate computations map onto simple 
resistive networks. 

We are now implementing these resistive 
networks in VLSI circuits, using sub¬ 
threshold CMOS technology. Many prob¬ 
lems in early vision can be formulated in 
terms of similar nonconvex energy func¬ 
tions that need to be minimized, such as 
binocular stereo, edge detection, surface 
interpolation, and structure from 
motion. 3,8,11 A similar approach to early
Figure 9. Partial layout of the CMOS chip implementing the resistive network shown in Figure 8, which interpolates sparsely
sampled noisy data. Only seven cells (out of the 20 x 20 cell array) are shown in order to demonstrate the hexagonal grid. Each
cell is dominated by the two capacitors (approximately two picofarads each) for holding the depth data and its associated
confidence value, consists of 46 transistors, and measures 180 x 132 $\lambda^2$. For a $\lambda$ = 1.5 µm production run, the total chip measures
5.4 x 2.6 mm. If image-acquisition devices (phototransistors) are placed on the chip, the sample-and-hold circuitry can be
eliminated, substantially reducing the area of the elementary cell. 13


Fault tolerance 

Hutchinson and Koch 15 demon¬ 
strated the robustness of these resis¬ 
tive networks to component errors. In 
their circuit simulations of the resistive 
network for interpolating surfaces 
from noisy and sparsely sampled data 
(shown in Figures 2b and 8), they 
replaced each transversal conductance
T by $T(1 + N(\sigma^2))$, where N is a
zero-mean Gaussian probability distribution
with variance $\sigma^2$.

Due to the linearity of the network
connections, errors “average out”
and performance is only marginally
impaired even for widely varying
conductances ($\sigma^2 = 0.5$). If nodes are
pulled accidentally to ground, the 
line processes in their immediate 
neighborhood turn on because of the 
high voltage gradient, isolating these 
nodes and preventing error propaga¬ 
tion. The saturation characteristic of 
our resistive elements outside their 
linear range (see Figure 7b) serves to 
prevent high current flows, because 
high voltage gradients between 
neighboring nodes induce only a 
constant maximal current. 4 


vision—using the fine-grained, mesh-type, 
single-instruction, multiple-data parallel 
Connection Machine instead of resistive 
networks—is being pioneered at MIT’s 
Artificial Intelligence Laboratory in the 
Vision Machine project. 5,14

These networks share several features 
with biological neuronal networks. Specif¬ 
ically, they do not require a system-wide 
clock, they rely on many connections 
between simple computational nodes, they 
converge rapidly (within several time cons¬ 
tants), and they are quite robust to hard¬ 
ware errors. (See the sidebar “Fault 
tolerance.”) 

Our networks consume moderate
amounts of power, because each resistive
element operates in the millivolt and
10-nanoampere range. The entire retina
chip 13 requires about 100 µW (the dominant
power consumption lies in the
photoconversion stage).

These features—real-time performance, 
low power consumption, robustness, and 
small spatial dimensions—make these cir¬ 
cuits attractive for a variety of deep space 
missions. In collaboration with the Jet 
Propulsion Laboratory, we are currently 
evaluating the feasibility of such resistive 
network-based vision systems for autono¬ 
mous vehicles to be used in the exploration 
of planetary surfaces, such as that of 
Mars. □ 

A Neural Network for Visual 
Pattern Recognition 

Kunihiko Fukushima 

NHK Science and Technical Research Laboratories 


Visual pattern recognition such as 
reading characters or distinguish¬ 
ing shapes—a task easily accom¬ 
plished by human beings—presents 
significant difficulties to those attempting 
to design information processors that can 
do the same thing. The solution to this 
design dilemma apparently resides in the 
brain itself. 

The human brain has more than 10 bil¬ 
lion neural cells, which have complicated 
interconnections and constitute a large- 
scale network. Hence, uncovering the neu¬ 
ral mechanisms of the higher functions of 
the brain is not easy. In the conventional 
neurophysiological approach, for exam¬ 
ple, a microelectrode is used to record the 
response of these cells. However, the 
recording can be made from, at most, a 
few cells simultaneously. Although we can 
obtain fragmentary knowledge thus, 
understanding the mechanism of the net¬ 
work as a whole proves difficult. 

A modeling approach, which is a syn¬ 
thetic approach using neural network 
models, therefore continues to gain impor¬ 
tance. In the modeling approach, we study 
how to interconnect neurons to synthesize 
a brain model, which is a network with the 
same functions and abilities as the brain. 

When synthesizing a model, we try to 
follow physiological evidence as faithfully 
as possible. For parts not yet clear, how¬ 
ever, we construct a hypothesis and syn¬ 
thesize a model that follows the 
hypothesis. We then analyze or simulate 
the behavior of the model and compare it 


This model of the 
neural network offers 
insight into the brain’s 
complex mechanisms 
as well as design 
principles for new 
information 
processors. 


with that of the brain. If we find any dis¬ 
crepancy in the behavior between the 
model and the brain, we change the initial 
hypothesis and modify the model follow¬ 
ing a new hypothesis. We then test the 
behavior of the model again. We repeat 
this procedure until the model behaves in 
the same way as the brain. Although we 
must still verify the validity of the model 
by physiological experiment, it is probable 
that the brain uses the same mechanism as 
the model, because both respond in the 
same way. Hence, modeling neural net¬ 
works promises to help us uncover the 
mechanism of the brain. 

The relationship between modeling neural 
networks and neurophysiology resembles 
that between theoretical physics and 
experimental physics. Modeling takes a 
synthetic approach, while neurophysiol¬ 
ogy or psychology takes an analytical 
approach. 

Once we complete a model, its simplifi¬ 
cation makes it easy to see the essential 
algorithm of information processing in the 
brain. We can use the algorithm directly as 
a design principle for new information 
processors. 

Modeling neural networks is useful in 
explaining the brain and also in engineer¬ 
ing applications. It brings the results of 
neurophysiological and psychological 
research to engineering applications in the 
most direct way possible. 

This article discusses a neural network 
model thus obtained, a model with selec¬ 
tive attention in visual pattern recog¬ 
nition. 1,2 

Researchers have reported various 
models capable of visual pattern recogni¬ 
tion, models that have the function of self¬ 
organization and can learn to recognize 
patterns. Many are hierarchical networks 
consisting of layers of neuron-like cells. 
The ability to process information 
increases in proportion to the number of 
layers in the network. Various studies have 
attempted to find effective learning procedures 
for the self-organization of multilayered 
networks. An example of 
learning-with-a-teacher (supervised learning) 
useful for training multilayered networks 
is the back-propagating errors 
procedure. 3 The cognitron 4 model, which 
I proposed in 1975, uses learning-without- 
a-teacher (unsupervised learning). The 
procedure used in the cognitron has been 
classified under the competitive-learning 
paradigm. 5 

Figure 1. Input-to-output characteristics of a u_S cell: a typical example of the cells 
employed in the neural network model. 7 The strength of the input connections (or 
weights), a(1), a(2), ..., and b, is variable and reinforced during the process of 
self-organization of the network. The excitatory effect e, which is the weighted sum of 
all the excitatory inputs, is suppressed by the inhibitory effect h in a shunting 
manner. When the inhibition is stronger than the excitation, the output of the cell 
becomes zero. Some of the cells in the network have fixed input connections 
formed from the beginning and not variable. In some kinds of cells, a nonlinear 
summation of excitatory inputs is performed. 

The cognitron, like many other models, 
does not have the ability to correctly recog¬ 
nize position-shifted or shape-distorted 
patterns. The conventional cognitron 
usually recognizes the same pattern 
presented at a different position as a 
different pattern. I proposed the 
neocognitron 6,7 to eliminate this defect. 
The neocognitron has the ability to recog¬ 
nize stimulus patterns correctly, even if the 
patterns are shifted in position or distorted 
in shape. 

When two or more patterns are 
presented simultaneously, however, the 
neocognitron does not always correctly 
recognize them. To improve the function 
of the neocognitron, backward connec¬ 
tions were added to the conventional 
neocognitron, which had only forward 
connections. The new model thus obtained 
acquired the function of selective attention 
in visual pattern recognition. This model, 
discussed in this article, can automatically 
segment and recognize individual patterns 
presented simultaneously. The model can 
also restore imperfect patterns and elimi¬ 
nate noise from contaminated patterns. 

Although various models of associative 
memory have reportedly been able to recall 
complete patterns from imperfect ones, 8 
most do not work well unless the stimulus 
pattern is identical in size, shape, and posi¬ 
tion to a training pattern. In contrast to 
such earlier models, the new model works 
well even for deformed stimulus patterns, 
regardless of their position. 

Physiology 

In the visual area of the cerebrum, neu¬ 
rons respond selectively to local features 
of a visual pattern, such as lines and edges 
in particular orientations. 9 In the area 
higher than the visual cortex, cells exist 
that respond selectively to certain figures 
like circles, triangles, squares, or even 
human faces. 10 Accordingly, the visual 
system seems to have a hierarchical structure, 
in which simple features are first 
extracted from a stimulus pattern, then 
integrated into more complicated ones. In 
this hierarchy, a cell in a higher stage 
generally receives signals from a wider area 
of the retina and is more insensitive to the 
position of the stimulus. 

Within the hierarchical structure of the 
visual system are forward (afferent, or 
bottom-up) and backward (efferent, or 
top-down) signal flows. For example, ana¬ 
tomical observations show that the major 
visual areas of the cerebrum interconnect 
in a precise topographical and reciprocal 
fashion. 

Such neural networks in the brain are 
not always complete at birth. They develop 
gradually, neurons extending branches 
and making connections with many other 
neurons, adapting flexibly to circum¬ 
stances after birth. 

This kind of physiological evidence sug¬ 
gests a network structure for the model. 


Outline of the model 

The model is a network consisting of 
neuron-like cells of the analog type; that 
is, their inputs and outputs take non¬ 
negative analog values, corresponding to 
the instantaneous firing frequencies of 
biological neurons. Figure 1 shows a typi¬ 
cal example of the cells employed in the 
network. 
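
The behavior of such a cell can be sketched in a few lines. This is a minimal illustration rather than the article's own equations: it assumes the shunting form (1 + e)/(1 + h) - 1 commonly given in the neocognitron literature, and the function name `s_cell_output` and the sample weights are hypothetical.

```python
def s_cell_output(inputs, a, v, b):
    """Analog cell with shunting inhibition (assumed form).

    inputs: non-negative excitatory inputs
    a:      excitatory weights a(1), a(2), ...
    v:      output of the inhibitory cell
    b:      inhibitory weight
    """
    e = sum(w * x for w, x in zip(a, inputs))  # excitatory effect
    h = b * v                                  # inhibitory effect
    # Shunting: the inhibition divides rather than subtracts.
    x = (1.0 + e) / (1.0 + h) - 1.0
    return max(x, 0.0)  # outputs are non-negative analog values


strong = s_cell_output([1.0, 0.5], a=[2.0, 2.0], v=0.5, b=1.0)
weak = s_cell_output([0.1, 0.0], a=[2.0, 2.0], v=0.5, b=1.0)
```

Under this assumed form the expression equals (e - h)/(1 + h), so the output falls to zero whenever the inhibition is at least as strong as the excitation, as the caption of Figure 1 states.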

The network has a hierarchical mul¬ 
tilayered structure consisting of a cascade 
of many layers of cells, as shown in Figure 
2. The network has forward and backward 
connections between cells. In this hierar¬ 
chy, the forward signals manage the func¬ 
tion of pattern recognition, while the 
backward signals manage the function of 
selective attention, pattern segmentation, 
and associative recall. 

Some of the connections between cells 
are variable, and the network can acquire 
the ability to recognize patterns by 
learning-without-a-teacher. The network 
can be trained to recognize any set of pat¬ 
terns. During the process of learning, var¬ 
iable connections grow gradually in 
accordance with the stimuli given the net¬ 
work. The repeated presentation of a set 
of training patterns is sufficient for the 
self-organization of the network; it does 
not need information about the categories 
into which these patterns should be clas¬ 
sified. 

Figure 2. Hierarchical network structure and the signal flow in the network. 
Forward (afferent) signals: pattern recognition. Backward (efferent) signals: 
selective attention. 

Figure 3. The forward and backward signals and their interaction in the 
hierarchical network. 

When a composite figure consisting of 
two or more patterns is presented to the 
model that has finished learning, the 
model selectively focuses its attention on 
one pattern after another, segments the 
pattern from the others, and recognizes it 
separately. Even if noise or defects affect 
the pattern, the model can recognize it and 
recall the complete pattern in which the 
noise has been eliminated and the defects 
corrected. Perfect recall does not require 
that the stimulus pattern be identical in 
shape to the training pattern the model 
learned. A pattern distorted in shape or 
changed in size can be correctly recognized 
and the missing portions restored. 

Figure 3 schematically illustrates the sig¬ 
nal flow in the network. In this diagram, 
layers of cells in the forward paths and the 
backward paths are drawn separately. 

A stimulus pattern is presented to the 
lowest stage of the forward paths, the 
input layer, which consists of a two- 
dimensional array of receptor cells. The 
highest stage of the forward paths is the 
recognition layer. After the process of 
learning ends, the final result of the pattern 
recognition shows in the response of the 
cells of the highest stage. In other words, 
cells of the recognition layer work as gnos¬ 
tic cells (or grandmother cells); usually one 
cell is activated, corresponding to the cat¬ 
egory of the specific stimulus pattern. Pat¬ 
tern recognition by the network occurs on 
the basis of similarity in shape among pat¬ 
terns, unaffected by deformation, changes 
in size, and shifts in the position of the 
input patterns. 

The output of the recognition layer is 
sent to lower stages through the backward 
paths. The forward and the backward sig¬ 
nals interact with each other in the hierar¬ 
chical network. The backward signals 
facilitate the forward signals and, at the 
same time, the forward signals gate the 
backward signal flow. * 

The result of associative recall appears 
in the lowest stage of the backward paths, 
the recall layer. We can also interpret the 
output of the recall layer as the result of 
segmentation. The response of the recall 
layer is fed back to the input layer. 

At each stage of the hierarchical network, 
several kinds of cells exist. Notation 
such as u_S, u_C, w_S, and w_C denotes cells, 
where letters u and w indicate the cells in 
the forward paths and backward paths, 
respectively. Figure 4 illustrates how the 
different cells, represented by circles, are 
connected to each other. Although the figure 
shows only one of each kind of cell in 
each stage, numerous cells actually exist 
arranged in a two-dimensional array. For 
example, the notation u_Cl denotes a u_C cell 
in the l-th stage of the hierarchical network. 
The L-th stage is the highest stage. 
In Figure 4, L = 3. The notation U_Cl 
denotes the layer of u_Cl cells. In Figure 4, 
layer U_C0 is the input layer, layer U_C3 is 
the recognition layer, and layer W_C0 is the 
recall layer. 

*This process resembles that of the adaptive resonance 
theory 11 in the sense that the forward and backward 
signals interact with each other, but the method of 
interaction differs. 

Between the cells in Figure 4 are connec¬ 
tions denoted by single lines or by double 
lines. A single line indicates one-to-one 
connections between the two groups of 


cells; a double line indicates converging or 
diverging connections between them. 

A more detailed illustration of the spa¬ 
tial interconnections between neighboring 
cells appears in Figure 5. The functions of 
these cells will be considered next. 

Forward paths in the 
network 

If we consider the forward paths in the 
network only, the model has almost the 
same structure and function as the 
neocognitron 6,7 neural network model. In 
the forward paths of the network, layers 
of u_S cells and u_C cells are arranged 
alternately, as shown in Figures 4 and 5. 

Figure 4. Hierarchical structure of the interconnections between different kinds of cells. 2 

Figure 5. Detailed diagram illustrating spatial interconnections between 
neighboring cells. 

Cells denoted by u_S are feature-extracting 
cells. Connections converging 
to these cells are variable and reinforced by 
learning (or training). After finishing the 
learning, u_S cells can extract features 
from the input pattern. In other words, a 
u_S cell is activated only when a particular 
feature is presented at a certain position in 
the input layer. The features extracted by 
the u_S cells are determined during the 
learning process. Generally speaking, local 
features, such as a line at a particular 
orientation, are extracted in the lower 
stages. More global features, such as part 
of a training pattern, are extracted in 
higher stages. The process of learning and 
the mechanism of feature extraction by u_S 
cells are discussed below in 
“Self-organization of the network.” 

The u_C cells are inserted in the network 
to allow for positional errors in the features 
of the stimulus. Connections from 
u_S cells to u_C cells are fixed and 
invariable. 

The lower part of Figure 6 shows a 
detailed structure between layers of u_S 
cells and u_C cells. Each layer of u_S cells or 
u_C cells is divided into subgroups according 
to the features to which they respond. 
The cells in each subgroup are arranged in 
a two-dimensional array. The connections 
converging to the cells in a subgroup are 
homogeneous: All the cells in a subgroup 
receive input connections of the same spatial 
distribution, where only the positions 
of the preceding cells shift in parallel with 
the position of the cells in the subgroup. 
This condition of homogeneity holds for 
fixed connections and for variable connections. 
As discussed in “Self-organization 
of the network,” the reinforcement of the 
variable connections is always performed 
under this condition. 

Each u_C cell receives signals from a 
group of u_S cells that extract the same feature, 
but from slightly different positions. 
The u_C cell is activated if at least one of 
these u_S cells is active. Even if the stimulus 
feature shifts position and another u_S 
cell is activated instead of the first one, the 
same u_C cell keeps responding. Hence, the 
u_C cell’s response is less sensitive to shifts 
in position of the input pattern. 
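
The shift tolerance of a u_C cell can be sketched as follows. This is a simplified illustration under stated assumptions: the cells lie in a one-dimensional row, and taking the maximum over a small neighborhood stands in for "activated if at least one of these u_S cells is active." The function name `c_cell_output` and the `radius` parameter are hypothetical.

```python
def c_cell_output(s_responses, center, radius=1):
    """Respond if any u_S cell within `radius` of `center` is active.

    The maximum over the neighborhood is an assumed simplification.
    """
    lo = max(0, center - radius)
    hi = min(len(s_responses), center + radius + 1)
    return max(s_responses[lo:hi])


# One row of u_S cells that all extract the same feature.
original = [0.0, 0.0, 0.8, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 0.8, 0.0]  # same feature, one position over

same_response = c_cell_output(original, 2) == c_cell_output(shifted, 2)
```

The u_C cell centered on position 2 gives the same output for both inputs, while a cell centered far from the feature (for example at position 0) stays silent.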

In the whole network, with its alternate 
layers of u_S cells and u_C cells, the process 
of feature extraction by u_S cells and 
toleration of positional shift by u_C cells 
repeats. During this process, local features 
extracted in a lower stage are gradually 
integrated into more global features. 
Figure 6 illustrates this situation. 

Finally, each u_C cell of the recognition 
layer U_CL at the highest stage (denoted by 
u_CL) integrates all the information of the 
input pattern; each cell responds only to 
one specific pattern. In other words, only 
one u_CL cell, corresponding to the category 
of the input pattern, is activated. 
Other cells respond to the patterns of other 
categories. 

Tolerating positional error a little at a 
time at each stage, rather than all in one 
step, plays an important role in endowing 
the network with an ability to recognize 
even distorted patterns. Figure 7 illustrates 
this. Let a u_S cell in an intermediate stage 
of the network have already been trained 
to extract a global feature consisting of 
three local features of a training pattern 
“A” as shown in Figure 7a. The cell tolerates 
the positional error of each local feature, 
if its deviation falls within the dotted 
circle. Hence, this u_S cell responds to any 
of the deformed patterns shown in Figure 
7b. The toleration of positional errors 
should not be too large at this stage. If 
errors that are too large are tolerated at one 
step, the network may come to respond 
erroneously, such as recognizing a stimulus 
like Figure 7c as an “A” pattern. 

Since errors in the relative position of 
local features are tolerated in the process 
of extracting and integrating features, the 
same u_CL cell responds in the highest 
stage, even if the input pattern is 
deformed, changed in size, or shifted in 
position.* The network recognizes the 
“shape” of the pattern independent of its 
size and position. 

When two or more patterns are simultaneously 
presented to the input layer 
U_C0, two or more u_CL cells may be activated 
at first. However, all of these cells 
but one soon stop responding. Usually 
only one u_CL cell continues to respond 
because of competition by lateral inhibition 
between feature-extracting u_S cells. 2 
Lateral inhibition works in the highest 
stage and in the intermediate stages. 




Self-organization of 
the network 


Figure 6. Illustration of the mechanism of pattern recognition in the forward 
paths. 6 


The self-organization of the network 
results from learning-without-a-teacher. 
The network by itself acquires the ability 
to classify and to recognize patterns cor¬ 
rectly on the basis of similarity in shape. 

In the initial state before learning, all the 
variable connections in the network have 
a strength of zero. In the network, 
reinforcement of the forward connections 
comes first; the backward connections are 
reinforced later. Hence, during the process 
of reinforcement of the forward connections, 
no backward signal flows in the network. 
The variable forward connections 
are reinforced in a manner similar to that 
in the neocognitron. 6,7 After finishing the 
reinforcement of the forward connections, 
the network reinforces the backward 
connections by the same amount as the 
forward connections with which they form 
pairs. 

*It is difficult to state quantitatively to what degree the 
network can cope with deformation in patterns, 
because we do not have an appropriate mathematical 
measure to correctly express the psychological feeling 
of the deformation. We can get a rough idea of the 
degree of deformation that can be tolerated. For example, 
in an article to be published in April 1988, 12 
Figure 14 shows some examples of deformed patterns that 
the neocognitron recognized correctly. These examples 
were obtained by a neocognitron trained by 
learning-with-a-teacher. Generally, the results 
obtained by learning-without-a-teacher are somewhat 

Figure 7. Illustration of the principle for recognizing deformed patterns. 

Figure 8. The process of reinforcement of the forward connections converging to a 
feature-extracting u_S cell. The density of the shadow in a circle represents the 
intensity of the response of the cell. (a) shows the initial state before training, (b) shows 
stimulus presentation during the training, (c) shows the connections after 
reinforcement. 

The reinforcement of the forward con¬ 
nections is performed according to two 
principles. The first was introduced for the 
self-organization of the cognitron 4 
model. Specifically, among the cells situ¬ 
ated in a certain small area, only the one 
responding most strongly has its input con¬ 
nections reinforced. The amount of rein¬ 
forcement of each input connection to this 
maximum-output cell is proportional to 
the intensity of the response of the cell 
from which the relevant connection leads. 

This principle is applied to the variable 
input connections converging to feature- 
extracting u_S cells. Both excitatory and 
inhibitory connections are reinforced 
following this principle. 
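
The first principle can be sketched as a winner-take-all update. This is a minimal illustration under stated assumptions, not the article's own procedure: each cell's response is taken to be a plain weighted sum, the cells start from very weak connections, and the function name `reinforce_winner` and the constant `rate` are hypothetical.

```python
def reinforce_winner(weights, inputs, rate=1.0):
    """Apply the first principle to one small area of cells.

    weights: one list of input-connection strengths per cell
    inputs:  responses of the cells in the preceding layer
    rate:    hypothetical proportionality constant
    Returns the index of the cell that was reinforced.
    """
    responses = [sum(w * x for w, x in zip(ws, inputs)) for ws in weights]
    winner = max(range(len(weights)), key=lambda i: responses[i])
    # Only the winner grows, in proportion to each input's intensity.
    weights[winner] = [w + rate * x for w, x in zip(weights[winner], inputs)]
    return winner


# Two cells with very weak initial connections (assumed values).
weights = [[0.01, 0.00, 0.00], [0.00, 0.00, 0.01]]
stimulus = [1.0, 0.5, 0.0]
winner = reinforce_winner(weights, stimulus)
```

After one presentation, the connections of the winning cell mirror the spatial distribution of the stimulus, while the connections of the losing cell are left unchanged.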

Figure 8 illustrates this process of 
reinforcement, showing only the forward 
connections converging to a u_S cell. As shown 
in Figure 8a, the u_S cell receives variable 
excitatory connections from a group of 
u_C cells in the preceding layer. The cell 
also receives a variable inhibitory connection 
from a subsidiary inhibitory cell, 
called a u_SV cell. The u_SV cell receives fixed 
excitatory connections from the same 
group of u_C cells as does this u_S cell, and 
always responds with the average intensity 
of the output of the u_C cells. The initial 
strength of these variable connections is 
nearly zero.* 

Suppose this u_S cell responds most 
strongly of the u_S cells in its vicinity when 
a training stimulus is presented (see Figure 
8b). According to the first principle, variable 
connections leading from activated 
u_C and u_SV cells are reinforced as shown in 
Figure 8c. The variable excitatory connections 
to the u_S cell grow into a “template” 
that exactly matches the spatial distribution 
of the response of the cells in the 
preceding layer. The inhibitory variable 
connections from the u_SV cell are reinforced 
at the same time, but not as strongly 
because the output of the u_SV cell is not as 
large. 

After completion of the training, the u_S 
cell acquires the ability to extract the feature 
of the stimulus presented during the 
training period. Through the excitatory 
connections, the u_S cell receives signals 
indicating the existence of the relevant feature 
to be extracted. If an irrelevant feature 
is presented, the inhibitory signal from the 
u_SV cell becomes stronger than the direct 
excitatory signals from the u_C cells, and 
the response of the u_S cell is suppressed. 
The u_S cell is activated only when the relevant 
feature is presented. We could say 
that the u_SV cell watches for the existence 
of irrelevant features. Thus, inhibitory 
u_SV cells play an important role in endowing 
the feature-extracting u_S cells with the 
ability to differentiate irrelevant features, 
and in increasing the selectivity of feature 
extraction. 

*Each u_S cell has very weak and diffused excitatory 
connections only during the initial period of 
self-organization. Once a reinforcement of the input 
connections begins, these weak and diffuse initial 
connections disappear. 12 

According to this principle, among the 
u_S cells in a certain small area, only the 
one cell that yields the maximum output 
has its input connections reinforced. 
Because of the “winner-takes-all” nature 
of this principle, duplicate formation of 
cells that extract the same feature does not 
occur, and the formation of a redundant 
network can be prevented. Only the one 
cell giving the best response to a training 
stimulus is selected, and only that cell is 
reinforced so as to respond more appropri¬ 
ately to the stimulus. 

Once a cell is selected and reinforced to 
respond to a feature, the cell usually loses 
its responsiveness to other features. When 
a different feature is presented, usually a 
different cell yields the maximum output 
and has its input connections reinforced. 
Thus, “division of labor” among the cells 
occurs automatically. 

With this principle, the network also 
develops a self-repairing function. If a cell 
that has responded strongly to a stimulus 
is damaged and ceases to respond, another 
cell, which happens to respond more 
strongly than others, starts to grow and 
substitutes for the damaged cell. Until 
then, the larger response of the first cell 
had prevented the growth of a second cell. 

The second principle introduced for the 
self-organization of the network states that 
the maximum-output cell not only grows, 
but also controls the growth of neighbor¬ 
ing cells, working, so to speak, like a seed 
in crystal growth. Neighboring cells have 
their input connections reinforced in the 
same way as the seed cell. 

When a seed cell is selected from a subgroup 
of u_S cells, all the other u_S cells in 
the subgroup grow to have input connections 
of the same spatial distribution as the 
seed cell. As a result, all the u_S cells in a 
subgroup grow to receive input connections 
of identical spatial distribution where 
only the positions of the preceding u_C 
cells have shifted in parallel with the positions 
of the u_S cells. Because connections 
develop iteratively in a subgroup, all the 
u_S cells in the subgroup come to respond 
selectively to a particular feature. Differences 
among these cells arise only from 
differences in position of the feature to be 
extracted. 
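
The effect of the second principle can be sketched as one set of connections shared by every cell of a subgroup. This is a one-dimensional illustration under stated assumptions: responses are plain weighted sums, and the name `subgroup_responses` and the sample values are hypothetical.

```python
def subgroup_responses(kernel, prev_layer):
    """Responses of a subgroup whose cells all copy the seed cell's kernel.

    Each cell looks at a window of `prev_layer` that shifts in step
    with the cell's own position (1-D for brevity).
    """
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, prev_layer[i:i + k]))
        for i in range(len(prev_layer) - k + 1)
    ]


seed_kernel = [1.0, 0.5]           # connections grown by the seed cell
prev_layer = [0.0, 1.0, 0.5, 0.0]  # the feature is present at position 1
responses = subgroup_responses(seed_kernel, prev_layer)
```

Every cell applies the same connections, so the cells differ only in where they look: the cell aligned with the feature (position 1 here) responds most strongly.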


Backward paths 

The output of the recognition layer U_CL 
is sent back to lower stages through backward 
paths. It reaches the recall layer W_C0 
at the lowest stage of the backward paths. 
The backward signals are transmitted 
retracing the same route as the forward signals, 
because the cells in the backward 
paths receive gate signals from the cells in 
the forward paths. Guided by the forward 
signal flow, the backward signals reach 
exactly the same positions at which the 
input pattern is presented. 

Since the backward signals are sent back 
only from the activated u_CL cell, only the 
signal components corresponding to the 
recognized pattern reach the recall layer 
W_C0. Therefore, we can interpret the output 
of the recall layer as the result of 
segmentation, where only components 
relevant to a single pattern are selected 
from the stimulus. Even if the stimulus 
pattern now recognized is a deformed version 
of a training pattern, the deformed 
pattern is segmented and emerges with its 
deformed shape. 

Let us consider this process in more 
detail. First, look at the backward signals 
from an arbitrary w_S cell to the w_C cells of 
the preceding stage (see Figure 5). The network 
is designed so that the strength of the 
variable backward connections is automatically 
controlled in the following manner: 
After finishing the reinforcement of 
the forward connections (refer to Figure 
8), the backward connections descending 
from a w_S cell are automatically reinforced 
to have a strength proportional to 
the forward connections ascending to the 
u_S cell paired with the w_S cell. Consequently, 
if an excitatory forward path 
forms to a u_S cell from a u_C cell, an excitatory 
backward path forms automatically 
from the corresponding w_S cell to the 
corresponding w_C cell. This also holds for the 
inhibitory backward path via the subsidiary 
w_SV cell, which corresponds to the u_SV 
cell in the forward paths. Hence, depending 
on whether a u_S cell receives an overall 
excitatory or inhibitory effect from a 
u_C cell through forward connections, the 
corresponding w_C cell also receives an 
overall excitatory or inhibitory effect from 
the corresponding w_S cell through backward 
connections. 

Figure 9. A simplified illustration of the forward signal flow in the network. 
Deformed stimuli (a) and (b) presented at different positions on the input layer 
(U_C0) elicit the same response from the u_C cell at the highest stage. The backward 
signals are controlled to retrace the same route as the forward signals, in the 
opposite direction. 

Corresponding to the fixed forward 
connections that converge to a u_C cell 
from a number of u_S cells, many backward 
connections diverge from a w_C cell 
towards w_S cells (see Figure 5). However, 
we do not want all the w_S cells receiving 
excitatory backward signals from the w_C 
cell to be activated, for the following reason: 
To activate a u_C cell in the forward 
path, the activation of at least one preceding 
u_S cell is sufficient. Usually only a 
small number of preceding u_S cells actually 
are activated, as shown in Figure 9. To 
elicit a similar response from the w_S cells 
in the backward paths, the network is 
synthesized in such a way that each w_S cell 
receives excitatory backward signals from 
w_C cells and a gate signal from the 
corresponding u_S cell. The w_S cell is activated 
only when it receives a signal both from 
u_S and w_C cells. Because of this network 
architecture, in the backward paths from 
w_C cells to w_S cells, the signals retrace the 
same route as the forward signals from u_S 
cells to u_C cells. 
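
The gating of backward signals can be sketched as a simple conjunction. This is a minimal illustration: the article states only the condition for activation, so passing the backward signal through unchanged when the gate is open is an assumption, and the name `w_s_output` is hypothetical.

```python
def w_s_output(u_s, w_c_signal):
    """A w_S cell fires only if it receives both signals.

    u_s:        response of the paired forward u_S cell (gate signal)
    w_c_signal: excitatory backward signal from the w_C cell
    """
    return w_c_signal if u_s > 0.0 and w_c_signal > 0.0 else 0.0


# Forward pass: only the u_S cell at index 2 responded.
u_s_cells = [0.0, 0.0, 0.7, 0.0]
# The w_C cell sends the same backward signal toward all four w_S cells.
w_s_cells = [w_s_output(u, 1.0) for u in u_s_cells]
```

Only the w_S cell paired with the active u_S cell responds, so the backward signal follows the route that the forward signal took.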


Gain control 

Interaction between forward and backward 
signals is not unilateral. Forward 
cells receive gain-control signals from the 
corresponding backward cells, and the forward 
signal flow is facilitated by the backward 
signals. The gain of each u_C cell in 
forward paths is controlled by a signal 
from a corresponding w_C cell (Figures 4 
and 5). When the w_C cell is silent, the gain 
between the inputs and the output of the 
u_C cell gradually attenuates from its initial 
value of 1.0 with the passage of time. 
When the w_C cell is activated, however, 
the attenuated gain recovers. Thus, forward 
signals are facilitated only in the 
paths in which backward signals flow. 
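
Gain control of this kind can be sketched as a per-step update. This is a minimal illustration under stated assumptions: the article gives only the initial gain of 1.0 and the direction of change, so the exponential decay, the recovery rule, and the constants `decay` and `recovery` are all hypothetical.

```python
def update_gain(gain, w_c_active, decay=0.9, recovery=0.5):
    """One time step of gain control for a u_C cell.

    decay and recovery are hypothetical constants.
    """
    if w_c_active:
        # Facilitation: the gain moves back toward its initial value 1.0.
        return gain + recovery * (1.0 - gain)
    # No backward signal: the gain attenuates.
    return gain * decay


attended, ignored = 1.0, 1.0
for _ in range(10):
    attended = update_gain(attended, w_c_active=True)
    ignored = update_gain(ignored, w_c_active=False)
```

After ten steps the facilitated path keeps its full gain, while the path without a backward signal has lost most of its responsiveness.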

Now let’s consider a case in which a 
stimulus consisting of two or more patterns 
is presented. Let one of the u_CL cells 
in the recognition layer be activated and 
one of the patterns in the stimulus recognized. 
Only the forward paths relevant to 
this pattern are facilitated by the action of 
backward signals from the u_CL cell. The 
forward paths corresponding to other patterns 
gradually lose their responsiveness 
because they receive no facilitation. Attention 
focuses selectively on only one of the 
patterns in the stimulus. 


Threshold control 

When some part of the input pattern is 
missing and the feature that should exist 
there fails to be extracted in the forward 
paths, the backward signal flow stops at 
that point. In such a case, the threshold for 
extraction of that feature automatically 
lowers and the model tries to extract even 
vague traces of the undetected feature. The 
w_CX cells detect the failure to extract a 
feature when the cells in the backward 
paths are active but the corresponding cells 
in the forward paths are not (Figures 4 and 
5). The signal from w_CX cells weakens the 
efficiency of inhibition by u_SV cells and 
virtually lowers the threshold for feature 
extraction by u_S cells. (The signal from 
w_CX cells works like a neuromodulator in 
biological systems.) Thus, u_S cells are 
made to respond even to incomplete features 
to which they would not, in the normal 
state, respond. 
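
The threshold-lowering mechanism can be sketched with a single cell. This is an illustration under stated assumptions, not the article's equations: the cell uses the shunting form (1 + e)/(1 + h) - 1 from the neocognitron literature, the w_CX signal is binary, and the names `w_cx_signal`, `s_cell_with_threshold_control`, and `strength` are hypothetical.

```python
def w_cx_signal(backward_active, forward_active):
    """Detect a feature expected by the backward path but not extracted."""
    return 1.0 if backward_active and not forward_active else 0.0


def s_cell_with_threshold_control(e, h, x, strength=0.8):
    """Shunting cell whose inhibition is weakened by the w_CX signal x.

    `strength` (the fraction of inhibition that x can remove) is a
    hypothetical constant.
    """
    h_eff = h * (1.0 - strength * x)
    return max((1.0 + e) / (1.0 + h_eff) - 1.0, 0.0)


# A faint trace of a feature: excitation is below inhibition.
normal = s_cell_with_threshold_control(e=0.3, h=0.5, x=0.0)
x = w_cx_signal(backward_active=True, forward_active=normal > 0.0)
lowered = s_cell_with_threshold_control(e=0.3, h=0.5, x=x)
```

In the normal state the faint feature produces no output; once the w_CX signal weakens the inhibition, the same input yields a positive response.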

Once a feature is thus extracted in the
forward paths, the backward signal can be
further transmitted to lower stages
through the path unlocked by the newly
activated forward cell. Hence, a complete
pattern, including defective parts, emerges
in the recall layer. Noise and blemishes
have been eliminated from this pattern,
because no backward signals return to
their components. Thus, we can interpret
the output of the recall layer W_C0 as an
auto-associative recall from associative
memory.

Sometimes all the u_CL cells in the
recognition layer are silent. The
no-response state of the u_CL cells may
occur, for instance, if the stimulus
pattern differs greatly in shape from the
original pattern, or if too many patterns
are simultaneously presented to the input
layer. If all the u_CL cells in the
recognition layer are silent, information
processing of the network goes no further
because no backward signal flows in the
network.

The no-response detector shown at the
far right in Figure 4 always monitors the
response of the u_CL cells. If all the u_CL
cells are silent, the no-response detector
sends another threshold-control signal
through path x (shown in Figure 4) to all
the u_S cells of all stages and lowers
their threshold for feature extraction. The
longer the silent state of the u_CL cells
continues, the higher the value of the
threshold-control signal. Hence, at least
one u_CL cell is activated after a certain
time.
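
As a rough illustration of the no-response detector, the sketch below raises a threshold-control signal in proportion to how long the recognition layer has been silent and subtracts it from the feature-extraction threshold. The linear ramp, the parameter `rate`, and the scalar `feature_strength` are hypothetical simplifications; the article states only that the signal grows the longer the silence lasts.

```python
def threshold_control(silent_steps, rate=0.1):
    """Hypothetical threshold-control signal: grows with the duration of silence."""
    return rate * silent_steps


def steps_until_response(feature_strength, base_threshold, rate=0.1, max_steps=100):
    """Return the first time step at which a feature of the given strength
    exceeds the lowered threshold, or None if it never does."""
    silent_steps = 0
    for t in range(max_steps):
        effective = base_threshold - threshold_control(silent_steps, rate)
        if feature_strength > effective:
            return t  # a cell responds, so the recognition layer is no longer silent
        silent_steps += 1  # still silent: the control signal keeps rising
    return None


# A weak feature (0.35) against a threshold of 0.6 responds only after
# the detector has lowered the threshold for several steps.
first_response = steps_until_response(0.35, 0.6)
```

Because the control signal keeps rising, any nonzero feature eventually exceeds the lowered threshold in this sketch, which mirrors the statement that at least one u_CL cell is activated after a certain time.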


Switching attention 

Suppose one pattern in a composite
stimulus is being attended to and
recognized. A momentary interruption of the
backward signal flow suffices to switch
attention to another pattern.

The gain of the u_C cells is designed to be
controlled as follows when switching
attention: After the disappearance of the
facilitating signal from the corresponding
w_C cell in the backward path, each u_C
cell has its gain lowered if the gain was
previously kept high by facilitation. The
decrease in gain resembles fatigue and
depends on the degree to which the gain
had been forced up until then. On the other
hand, the u_C cell will recover its gain if
the gain was previously attenuated.

Because of this method of controlling
the gain of the u_C cells, signals
corresponding to the previous pattern have
difficulty flowing through the forward
paths. Usually another u_CL cell, hitherto
silent, will be activated. (If no u_CL cell
is activated, the no-response detector
works until at least one u_CL cell is
activated.) Consequently, the backward
signals from the newly activated u_CL cell
ease the flow of the forward signals for
the new pattern. A repetition of this
process switches attention to each of the
patterns in the stimulus figure in turn,
and they are recognized and recalled one
after another.


Computer simulation 

Let’s look at the behavior of the model
as simulated on a MicroVAX II
minicomputer, with a program written in
Fortran. The simulated network has three
stages of hierarchy (L = 3). The input
layer U_C0 has 19 × 19 cells. The number
of cells in the network totals about
41,399.*

The variable connections in the network
were reinforced by
learning-without-a-teacher. Figure 10 shows
the five training patterns presented to the
network during the learning period. Each of
these patterns was presented only 11 times,
enough to complete the self-organization.**
During the learning period, these patterns
were repeatedly presented only in this
shape; deformed versions were not presented
at all. After completing the learning
cycle, all the variable connections were
fixed.

*The number of cells actually used in the
network depends on the set of training
patterns presented during the training
cycle. Because this number is difficult to
estimate in advance, the network in this
simulation series had a surplus of cells.
The total of 41,399, which excluded the
cells in the no-response detector, included
19 × 19 u_C0 and w_C0; 19 × 19 × 21 u_S1
and w_S1; 19 × 19 u_SV1 and w_SV1;
11 × 11 × 21 u_C1, w_C1, and w_CX1;
11 × 11 × 27 u_S2, w_S2, u_C2, w_C2, and
w_CX2; 11 × 11 u_SV2 and w_SV2; 7 × 7 × 5
u_S3 and w_S3; 7 × 7 u_SV3 and w_SV3; and
1 × 1 × 5 u_C3.
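
The total of 41,399 cells can be checked directly from the layer sizes listed in the footnote on the number of cells. The short Python tally below is only a bookkeeping aid; the tuple layout (rows, columns, cell planes, number of cell types sharing that size) is our own convention, not the author's.

```python
# (rows, columns, cell planes, number of cell types with this size)
LAYERS = [
    (19, 19, 1, 2),   # u_C0, w_C0
    (19, 19, 21, 2),  # u_S1, w_S1
    (19, 19, 1, 2),   # u_SV1, w_SV1
    (11, 11, 21, 3),  # u_C1, w_C1, w_CX1
    (11, 11, 27, 5),  # u_S2, w_S2, u_C2, w_C2, w_CX2
    (11, 11, 1, 2),   # u_SV2, w_SV2
    (7, 7, 5, 2),     # u_S3, w_S3
    (7, 7, 1, 2),     # u_SV3, w_SV3
    (1, 1, 5, 1),     # u_C3
]

# Sum rows * columns * planes over every cell type.
total_cells = sum(rows * cols * planes * kinds
                  for rows, cols, planes, kinds in LAYERS)
print(total_cells)  # 41399
```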

Figures 11 through 14 show the behavior
of a network that has finished learning. In
these figures, the response of the cells in
the input layer U_C0 and the recall layer
W_C0 is shown in time sequence. The
numeral to the upper left of each pattern
represents time t after the start of
stimulus presentation.*** The stimulus
pattern given to this network is identical
to the response of the input layer at
t = 0, shown in the upper left of each
figure. (Note that the input pattern p
appears directly in layer U_C0 at t = 0,
because no response is elicited from layer
W_C0 at t < 0.)

Figure 11 shows the response to a
stimulus consisting of two juxtaposed
patterns, “2” and “3.” In the recognition
layer, not shown in this figure, the u_CL
cell corresponding to pattern “2” happens
to be activated first. The signal is fed
back to the recall layer through backward
paths, but the middle part of the segmented
pattern “2” is missing because of the
interference from the closely adjacent “3.”
However, the missing part soon recovers
because the components of pattern “3,”
which is not being attended to, are
gradually attenuated by the decrease of
gain of the forward cells.

At t = 5, the backward signal flow is
interrupted for a moment to switch
attention. The mark ▼ denotes the execution
of this operation. Since the facilitating
signals from backward cells stop, forward
cells whose responsiveness has been kept
high by facilitation will now lose some of
their responsiveness. On the other hand,
cells whose responsiveness has been
decreased will recover their
responsiveness. Thus, the forward paths for
pattern “2,” which have so far been
facilitated, now lose their conductivity
and the u_CL cell for pattern “3” is
activated in the recognition layer. Since
the backward signals are fed back from
this newly activated u_CL cell, pattern “3”
emerges in the recall layer.

Figure 12 shows an example of the
response to a stimulus consisting of
superimposed patterns. The pattern “4”
is isolated first, the pattern “2” next,
and finally pattern “1” is extracted.

**The reinforcement of the variable
connections to cells of a higher stage was
delayed until completion of the learning of
the preceding stages. So, each of the
training patterns was presented three,
four, and four times for training stages 1,
2, and 3, respectively.

***Simulating the process of one step of t
takes several seconds on a MicroVAX II, but
computer time varies considerably depending
on the presented stimulus.

Segmentation of individual patterns can
be successful even if input patterns are
deformed in shape or shifted in position.
For example, the “2” patterns in Figures 11
and 12 differ in shape from the training
pattern (Figure 10), but the segmented
patterns appearing in the recall layer are
identical in shape to the stimulus patterns
now presented.

Pattern segmentation can be successful
even for a stimulus consisting of two
patterns from the same category. Figure 13
shows such an example, in which the
smaller “4” is segmented first, followed by
the larger “4.”

Figure 14 shows the response to a greatly
deformed pattern with several parts
missing and contaminated by noise. Because
of the large difference between the
stimulus and the training pattern, no
response is elicited from the recognition
layer (not shown in the figure) at first.
Accordingly, no feedback signal appears at
the recall layer W_C0. The no-response
detector detects this situation, and the
threshold-control signal is sent to all
feature-extracting cells in the network,
which makes them respond more easily even
to incomplete features. Thus, at time
t = 2, the u_CL cell for “2” is activated
in the recognition layer, and backward
signals are fed back from it. In the
pattern now sent back to the recall layer
W_C0, noise has been completely eliminated,
and some of the missing parts have begun to
be interpolated. This partly interpolated
signal, namely the output of the recall
layer W_C0, is again fed back positively to
the input layer U_C0. The interpolation
continues gradually while the signal
circulates through the feedback loop, and
finally the missing part of the stimulus is
completely filled in. The missing part is
interpolated quite naturally, despite the
considerable difference in shape between
the stimulus and the training pattern.

In the pattern for which interpolation
has already finished, the horizontal bar at
the bottom of the “2” is shorter than in
the training pattern. But no matter how
short the horizontal bar, the pattern is
still a valid character “2.” Hence, this
component of the pattern is left intact and
is reproduced as it appears in the stimulus
pattern. The deformation of the stimulus
pattern is tolerated, and only
indispensable missing parts are naturally
interpolated, without strain. It is also
important to note that noise has been
completely eliminated, because attention
is not focused on the components of the
noise.

As we can see from these experiments, 




Figure 10. Five training patterns used for learning. 



Figure 11. An example of the response to juxtaposed patterns.

Figure 12. An example of the response to superimposed patterns.

Figure 13. An example of the response to a stimulus consisting of two patterns of
the same category.


We can also consider this model one of
associative memory, with the ability to
repair imperfect patterns. In contrast to
earlier models, this model has perfect
associative recall, even for deformed
patterns, without regard to their
positions.

Finally, the model can learn. We can
train it to recognize any set of patterns.
Hence, we can design a universal system
that we can use, after training, for an
individual purpose. □

Figure 14. An example of the response to an incomplete distorted pattern with
noise.


The ART of Adaptive Pattern
Recognition by a
Self-Organizing Neural
Network

Gail A. Carpenter and Stephen Grossberg 
Boston University 


One of the central goals of computer
science is to design intelligent machines
capable of autonomous learning and skillful
performance within complex environments
that are not under strict external control.
Many scientists have turned to a study of
human capabilities as a source of new ideas
for designing such machines. When a
scientist undertakes such a study, he or
she encounters a number of basic issues, of
which we are all aware through our own
personal experiences:

Why do we pay attention? Why do we
learn expectations about the world? In
particular, how do we cope so well with
unexpected events? And, how do we manage
to do as well as we do when we are on
our own and do not have a teacher as a
guide? How do we learn what combinations
of facts are useful for dealing with a
given situation, and what combinations of
facts are irrelevant? How do we recognize
familiar facts so quickly even though we
have stored many other pieces of
information? How do we combine knowledge
about the external world with information
about our internal needs to quickly make
decisions that have a good chance of
satisfying those needs? Finally, what do
all of these properties have in common?

The adaptive resonance
theory suggests a
solution to the
stability-plasticity
dilemma facing
designers of learning
systems.


The stability-plasticity 
dilemma and ART 

Researchers have found one answer to
these questions through the attempt to
solve a basic design problem, called the
stability-plasticity dilemma, faced by all
intelligent systems capable of autonomously
adapting in real time to unexpected
changes in their world. A developing theory
called adaptive resonance theory, or ART,
suggests a solution to this problem.

0018-9162/88/0300-0077$01.00 © 1988 IEEE

The stability-plasticity dilemma asks:
How can a learning system be designed to
remain plastic, or adaptive, in response to
significant events and yet remain stable in
response to irrelevant events? How does
the system know how to switch between its
stable and its plastic modes to achieve
stability without rigidity and plasticity
without chaos? In particular, how can it
preserve its previously learned knowledge
while continuing to learn new things? And,
what prevents the new learning from
washing away the memories of prior
learning?

We can easily dramatize the ubiquity of
this problem: Imagine that you grew up in
Boston before moving to Los Angeles, but
periodically return to Boston to visit your
parents. Although you may need to learn
many new things to enjoy life in Los
Angeles, these new learning experiences do
not prevent you from remembering how to
find your parents’ house or otherwise get
around Boston. A multitude of similar
examples illustrate our ability to
successfully adapt to environments where
rules may change, without necessarily
forgetting our old skills. Moreover, we are
capable of successfully adapting to
environments where rules may change
unpredictably, and we can do so even if no
one tells us that the environment has
changed. We can adapt, in short, without
a teacher, through direct confrontation
with our experiences. Such adaptation is
called self-organization in the network
modeling literature.

One of the key computational ideas
rigorously demonstrated within the adaptive
resonance theory is that top-down
learned expectations focus attention upon
bottom-up information in a way that
protects previously learned memories from
being washed away by new learning, and
enables new learning to be automatically
incorporated into the total knowledge base
of the system in a globally self-consistent
way.

The ART architectures discussed here
are neural networks that self-organize
stable recognition codes in real time in
response to arbitrary sequences of input
patterns. Within such an ART architecture,
the process of adaptive pattern recognition
is a special case of the more general
cognitive process of hypothesis discovery,
testing, search, classification, and
learning. This property opens up the
possibility of applying ART systems to more
general problems of adaptively processing
large abstract information sources and
databases. This article outlines the main
computational properties of these ART
architectures, while comparing and
contrasting these properties with those of
alternative learning and recognition
systems. Technical details are described at
greater length elsewhere, 1,2 and several
books collect articles in which the theory
was developed through the analysis and
prediction of interdisciplinary data about
the brain and behavior. 3,4

Competitive learning 
models 

ART models grew out of an analysis of
a simpler type of adaptive pattern
recognition network, often called a
competitive learning model. Competitive
learning models developed in the early
1970s through contributions of Christoph
von der Malsburg 5 and Stephen Grossberg,
leading to the description of these models
in 1976 in several forms in which they are
used today. 6 Authors such as Shun-ichi
Amari, 7 Leon Cooper, 8 and Teuvo
Kohonen 9 have further developed these
models. Kohonen 9 has made particularly
strong use of competitive learning in his
work on self-organizing maps.
Grossberg 4 has provided a historical
discussion of the development of
competitive learning models.

In a competitive learning model (see
Figure 1), a stream of input patterns to a
network F_1 can train the adaptive weights,
or long-term memory (LTM) traces, that
multiply the signals in the pathways from
F_1 to a coding level F_2. In the simplest
such model, input patterns to F_1 are
normalized before passing through the
adaptive filter defined by the pathways
from F_1 to F_2. Level F_2 is designed as a
competitive network capable of choosing the
node which receives the largest total input
(“winner-take-all”). The winning population
then triggers associative pattern learning
within the vector of LTM traces which
sent its inputs through the adaptive
filter.

For example, as in Figure 1, let I_i
denote the input to the ith node v_i of
F_1, i = 1, 2, ..., M; let x_i denote the
activity, or short-term memory (STM) trace,
of v_i; let x_j denote the activity, or STM
trace, of the jth node v_j of F_2,
j = M + 1, ..., N; and let z_ij denote the
adaptive weight, or long-term memory (LTM)
trace, of the pathway from v_i to v_j. Then
let

\[ x_i = \frac{I_i}{\sum_{k=1}^{M} I_k} \tag{1} \]

be the normalized activity of v_i in
response to the input pattern
I = (I_1, I_2, ..., I_M). For simplicity,
let the output signal S_i of v_i equal x_i.
Let

\[ T_j = \sum_{i=1}^{M} x_i z_{ij} \tag{2} \]

be the total signal received at v_j from
F_1, let

\[ x_j = \begin{cases} 1 & \text{if } T_j > \max(T_k : k \neq j) \\ 0 & \text{if } T_j < \max(T_k : k \neq j) \end{cases} \tag{3} \]

summarize the fact that the node v_j in F_2
which receives the largest signal is chosen
for short-term memory storage, and let a
differential equation

\[ \frac{d}{dt} z_{ij} = \epsilon \, x_j \, (-z_{ij} + x_i) \tag{4} \]

specify that only the vector
Z_j = (z_1j, z_2j, ..., z_Mj) of adaptive
weights which abut the winning node v_j is
changed due to learning. Vector Z_j learns
by reducing the error between itself and
the normalized vector
X = (x_1, x_2, ..., x_M) in the direction
of steepest descent.
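
The competitive learning steps just described (normalize the input, sum the LTM-gated signals, pick the winner, and move only the winner's weight vector toward the normalized input) can be sketched in plain Python. This is an illustrative reading, not the authors' implementation: the function names, the layout `Z[j][i]`, the use of a single Euler step of size 1 for the learning equation, and the tie-breaking rule (lowest index wins) are all assumptions.

```python
def normalize(I):
    """Normalized F_1 activity: each input divided by the total input."""
    total = sum(I)
    return [value / total for value in I]


def total_signals(x, Z):
    """Total LTM-gated signal T_j reaching each F_2 node; Z[j][i] is the
    weight on the pathway from F_1 node i to F_2 node j."""
    return [sum(x_i * z_ij for x_i, z_ij in zip(x, Z_j)) for Z_j in Z]


def winner(T):
    """Winner-take-all choice; ties go to the lowest index (an assumption)."""
    return max(range(len(T)), key=lambda j: T[j])


def learn_step(I, Z, eps=0.5):
    """Present one input: only the winning node's weight vector moves toward
    the normalized input (one Euler step of the learning equation)."""
    x = normalize(I)
    j = winner(total_signals(x, Z))
    Z[j] = [z + eps * (x_i - z) for z, x_i in zip(Z[j], x)]
    return j


# Two coding nodes, four input nodes.
Z = [[0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5]]
chosen = learn_step([3, 1, 0, 0], Z)
```

After this presentation only the first weight vector has changed, moving halfway from (0.5, 0.5, 0, 0) toward the normalized input (0.75, 0.25, 0, 0); the second vector is untouched.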

Several equivalent ways describe how
such a system recognizes input patterns I
presented to F_1. The winning node v_j in
F_2 is said to code, classify, cluster,
partition, compress, or orthogonalize these
input patterns. In engineering, such a
scheme is said to perform adaptive vector
quantization. In cognitive psychology, it
is said to perform categorical
perception. 3

In categorical perception, input patterns
are classified into mutually exclusive
recognition categories separated by sharp
categorical boundaries. A sudden switch
in pattern classification can occur if an
input pattern is deformed so much that it
crosses one of these boundaries and
thereby causes a different node v_j to win
the competition within F_2. Categorical
perception, in the strict sense of the
word, occurs only if F_2 makes a choice. In
more general competitive learning models,
compressed but distributed recognition
codes are generated by the model’s coding
level or levels. 3,4,9

In response to certain input
environments, a competitive learning model
possesses very appealing properties. It has
been mathematically proved 6 that, if not
too many input patterns are presented to
F_1, or if the input patterns form not too
many clusters, relative to the number of
coding nodes in F_2, then learning of the
recognition code eventually stabilizes and
the learning process elicits the best
distribution of LTM traces consistent with
the structure of the input environment.

Despite the demonstration of input
environments that can be stably coded, it
has also been shown, through explicit
counterexamples, 1,2,6 that a competitive
learning model does not always learn a
temporally stable code in response to an
arbitrary input environment. In these
counterexamples, as a list of input
patterns perturbs level F_1 through time,
the response of level F_2 to the same input
pattern can differ on each successive
presentation of that input pattern.
Moreover, the F_2 response to a given input
pattern might never settle down as learning
proceeds.

Such unstable learning in response to a
prescribed input is due to the learning
that occurs in response to the other
intervening inputs. In other words, the
network’s adaptability, or plasticity,
enables prior learning to be washed away by
more recent learning in response to a wide
variety of input environments. In fact,
infinitely many input environments exist in
which periodic presentation of just four
input patterns can cause temporally
unstable learning. 1,2

Learning can also become unstable due
to simple changes in an input environment.
Changes in the probabilities of inputs, or
in the deterministic sequencing of inputs,
can readily wash away prior learning.
Moreover, this instability problem is not
peculiar to competitive learning models.
The problem is a basic one because it
arises from a combination of the very
features of an adaptive coding model that,
on the surface, seem so desirable: its
ability to learn from experience and its
ability to code, compress, or categorize
many patterns into a compact internal
representation. Due to these properties,
when a new input pattern I retrains a
vector Z_j of LTM traces, the set of all
input patterns coded by v_j also changes
because a change in Z_j in Equation 2 can
reverse the inequalities in Equation 3 in
response to many of the input patterns
previously coded by v_j.
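
A small numerical illustration of this recoding effect, written as a Python sketch: the weight values, the two input patterns `A` and `B`, and the fast-learning rate `eps=1.0` are hypothetical and were picked only so that a single retraining step flips a classification. The helper functions restate Equations 1 through 4 in discrete form.

```python
def normalize(I):
    total = sum(I)
    return [value / total for value in I]


def classify(I, Z):
    """Index of the F_2 node with the largest total signal (Equations 2 and 3)."""
    x = normalize(I)
    T = [sum(x_i * z for x_i, z in zip(x, Z_j)) for Z_j in Z]
    return max(range(len(T)), key=lambda j: T[j])


def train(I, Z, eps=1.0):
    """Move the winner's weights toward the normalized input (Equation 4,
    one Euler step; eps=1.0 copies the input, i.e. fast learning)."""
    x = normalize(I)
    j = classify(I, Z)
    Z[j] = [z + eps * (x_i - z) for z, x_i in zip(Z[j], x)]
    return j


Z = [[0.5, 0.5, 0.0],      # weights of F_2 node 0
     [0.0, 0.75, 0.25]]    # weights of F_2 node 1
A = [2, 3, 0]              # pattern initially coded by node 0
B = [1, 0, 0]              # intervening pattern, also won by node 0

before = classify(A, Z)    # node 0 codes A
train(B, Z)                # B retrains node 0's weight vector
after = classify(A, Z)     # A is now captured by node 1
```

Pattern A itself never changed; it was recoded solely because learning on B altered the weights of the node that used to code it.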

Learning systems that can become
unstable in response to many input
environments cannot safely be used in
autonomous machines that might be
unexpectedly confronted by one of these
environments on the job. Adaptive resonance
theory was introduced in 1976 to
show how to embed a competitive learning
model into a self-regulating control
structure whose autonomous learning and
recognition proceed stably and efficiently
in response to an arbitrary sequence of
input patterns.

Self-stabilized learning 
in an arbitrary input 
environment 



Figure 1. Stages of bottom-up activation: The input pattern I generates a pattern of
STM activation X = (x_1, x_2, ..., x_M) across F_1. Sufficiently active F_1 nodes emit
bottom-up signals to F_2. This signal pattern S is multiplied, or gated, by long-term
memory (LTM) traces z_ij within the F_1 → F_2 pathways. The LTM-gated signals are
summed before activating their target nodes in F_2. This LTM-gated and summed
signal pattern T, where T_j = Σ_i S_i z_ij, generates a pattern of STM activation
Y = (x_{M+1}, ..., x_N) across F_2.

Figure 2 schematizes a typical example
from a class of architectures called ART
1. It has been mathematically proved 1
that an ART 1 architecture is capable of
stably learning a recognition code in
response to an arbitrary sequence of
binary input patterns until it utilizes its
full memory capacity. Moreover, the
adaptive weights, or LTM traces, of an ART
1 system oscillate at most once during
learning in response to an arbitrary binary
input sequence, yet do not get trapped in
spurious memory states or local minima.
After learning self-stabilizes, the input
patterns directly activate the F_2 codes
that best represent them.

Figure 2. Matching by the 2/3 Rule: In (a), a top-down expectation from F_2 inhibits
the attentional gain control source as it subliminally primes target F_1 cells. Dotted
outline depicts primed activation pattern. In (b), only F_1 cells that receive
bottom-up inputs and gain control signals can become supraliminally active. In (c),
when a bottom-up input pattern and a top-down template are simultaneously active,
only those F_1 cells that receive inputs from both sources can become supraliminally
active. In (d), intermodality inhibition can shut off the F_1 gain control source and
thereby prevent a bottom-up input from supraliminally activating F_1, as when
attention shifts to a different input channel. Similarly, disinhibition of the F_1 gain
control source in (a) may cause a top-down prime to become supraliminal, as during
an internally willed fantasy.

As in a competitive learning model, an
ART architecture encodes a new input
pattern, in part, by changing the adaptive
weights, or LTM traces, of a bottom-up
adaptive filter. This filter is contained
in the pathways leading from a feature
representation field F_1 to a category
representation field F_2. In an ART
network, however, a second, top-down
adaptive filter, contained in the pathways
from F_2 to F_1, leads to the crucial
property of code self-stabilization. Such
top-down adaptive signals play the role of
learned expectations in an ART system.
Before considering details about how the
ART control structure automatically
stabilizes the learning process, we will
sketch how self-stabilization occurs in
intuitive terms.

Suppose that an input pattern I activates
F_1. Let F_1 in turn activate the code, or
hypothesis, symbolized by the node v_j1 at
F_2 which receives the largest total signal
from F_1. Then, F_2 quickly reads out its
learned top-down expectation to F_1,
whereupon the bottom-up input pattern
and top-down learned expectation are
matched across F_1. If these patterns are
badly matched, then a mismatch event
takes place at F_1 which triggers a reset
burst to F_2. This reset burst shuts off
node v_j1 for the remainder of the coding
cycle, and thereby deactivates the top-down
expectation controlled by v_j1. Then, F_1
quickly reactivates essentially the same
bottom-up signal pattern to F_2 as before.
Level F_2 reinterprets this signal pattern,
conditioned on the hypothesis that the
earlier choice v_j1 was incorrect, and
another node v_j2 is automatically chosen.

The parallel search, or hypothesis
testing, cycle of bottom-up adaptive
filtering from F_1 to F_2, code (or
hypothesis) selection at F_2, read-out of a
top-down learned expectation from F_2 to
F_1, matching at F_1, and code reset at F_2
now repeats itself automatically at a very
fast rate until one of three possibilities
occurs: (1) a node v_jm is chosen whose
top-down expectation approximately matches
input I; (2) a previously uncommitted F_2
node is selected; or (3) the full capacity
of the system is used and cannot
accommodate input I. Until one of these
outcomes prevails, essentially no learning
occurs, because all the STM computations of
the hypothesis testing cycle proceed so
quickly that the more slowly varying LTM
traces in the bottom-up and top-down
adaptive filters cannot change in response
to them. Significant learning occurs in
response to an input pattern only after the
hypothesis testing cycle that it generates
comes to an end.

If the hypothesis testing cycle ends in an
approximate match, then the bottom-up
input pattern and the top-down expectation
quickly deform the activity pattern
X = (x_1, x_2, ..., x_M) across F_1 into a
net pattern that computes a fusion, or
consensus, between the bottom-up and
top-down information. This fused pattern
represents the attentional focus of the
system. When fusion occurs, the bottom-up
and top-down signal patterns mutually
reinforce each other via feedback and the
system gets locked into a resonant state of
STM activation. Only then can the LTM
traces learn. What they learn is any new
information about the input pattern
represented within the fused activation
pattern across F_1. The fact that learning
occurs only in the resonant state suggested
the name “adaptive resonance theory.” Thus,
the system allows alteration of one of its
prior learned codes only if an input
pattern is sufficiently similar to what it
already knows to risk a further refinement
of its knowledge.

ART architectures
differ from other
popular neural
network learning
schemes in a number
of basic ways.

If the hypothesis testing cycle ends by
selecting an uncommitted node at F_2, then
the bottom-up and top-down adaptive
filters linked to this node learn the F_1
activation pattern generated directly by
the input. No top-down alteration of the
F_1 activation pattern occurs in this case.
If the full capacity has been exhausted and
no adequate match exists, learning is
automatically inhibited.

In summary, an ART network refines
its already learned codes based upon new
information that can be safely accommodated
into them via approximate matches,
selects new nodes for initiating learning
of novel recognition categories, or defends
its fully committed memory capacity against
being washed away by the incessant flux of
new input events.
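
The three outcomes of the hypothesis testing cycle can be made concrete with a deliberately simplified Python sketch for binary inputs. It is not the ART 1 dynamical system: the search is written as a sequential loop, templates are stored as binary lists, and the choice function `overlap / (beta + |template|)`, the match test `|I AND template| / |I| >= rho`, and the names `rho`, `beta`, and `capacity` are assumptions borrowed from common textbook presentations of ART 1 rather than from this article. The input is assumed to contain at least one active component.

```python
def art_present(I, templates, capacity, rho=0.7, beta=0.5):
    """Present binary input I. Returns the index of the category that codes I,
    or None when capacity is exhausted and no adequate match exists."""
    size_I = sum(I)
    reset = set()  # nodes shut off for the remainder of this coding cycle

    def choice(j):
        # Bottom-up signal: overlap with the template, biased toward small templates.
        overlap = sum(a & b for a, b in zip(I, templates[j]))
        return overlap / (beta + sum(templates[j]))

    while True:
        candidates = [j for j in range(len(templates)) if j not in reset]
        if not candidates:
            break
        j = max(candidates, key=choice)                   # code selection at F_2
        fused = [a & b for a, b in zip(I, templates[j])]  # matching at F_1
        if sum(fused) / size_I >= rho:
            templates[j] = fused                          # outcome 1: resonance, refine code
            return j
        reset.add(j)                                      # mismatch: reset and keep searching

    if len(templates) < capacity:
        templates.append(list(I))                         # outcome 2: commit a new node
        return len(templates) - 1
    return None                                           # outcome 3: learning inhibited


templates = []
first = art_present([1, 1, 0, 0], templates, capacity=2)
second = art_present([1, 1, 1, 0], templates, capacity=2)
third = art_present([0, 0, 0, 1], templates, capacity=2)
```

In this run the first input commits node 0, the second fails the match test against node 0 and commits node 1, and the third matches neither stored template; with both nodes committed it is rejected and the stored templates are left unchanged.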


Alternative learning 
schemes 

Many computational details have been 
worked out to make this scheme work well 
in an autonomous setting. 1,2 Before we 
describe some of these details, we should 
note that ART architectures differ from 
other popular neural network learning 
schemes, such as autoassociators, the 
Boltzmann machine, and back propaga¬ 
tion 9 ' 11 in a number of basic ways. These 
differences are schematized in Table 1. 

The most robust difference is that an
ART architecture is designed to learn
quickly and stably in real time in response
to a possibly nonstationary world with an
unlimited number of inputs until it
utilizes its full memory capacity. Many
alternative learning schemes become
unstable unless they learn slowly in a
controlled stationary environment with a
carefully selected total number of inputs
and do not use their full memory
capacity. 12 For example, a learning system
that is not self-stabilizing experiences a
capacity catastrophe in response to an
unlimited number of inputs: New learning
washes away memories of prior learning if
too many inputs perturb the system. To
prevent this from happening, either the
total number of input patterns that
perturbs the system needs to be restricted,
or the learning process itself must be shut
off before the capacity catastrophe occurs.

Shutting off the world is not possible in
many real-time applications. In particular,
how can such a system allow a familiar
input to be processed and recognized, but
block the processing of a novel input
pattern before the pattern destabilizes its
prior learning? In the absence of a
self-stabilization mechanism, an external
teacher must act as the system’s front end
to independently recognize the inputs and
make the decision. Shutting off learning at
just the right time to prevent either a
capacity catastrophe or a premature
termination of learning would also require
an external teacher. In either case, the
external teacher must be able to carry out
the recognition tasks that the learning
system was supposed to carry out. Hence,
non-self-stabilizing learning systems are
not capable of functioning autonomously in
ill-controlled environments.

In learning systems that need an exter¬ 
nal teacher to supply the correct represen¬ 
tation to be learned, the learning process 
is often driven by mismatch between 
desired and actual outputs. 10 ' 11 Such 
Table 1. ART architectures compared to other learning schemes.

ART architecture | Alternative learning properties
Real-time (on-line) learning | Lab-time (off-line) learning
Nonstationary world | Stationary world
Self-organizing (unsupervised) | Teacher supplies correct answer (supervised)
Memory self-stabilizes in response to arbitrarily many inputs | Capacity catastrophe in response to arbitrarily many inputs
Effective use of full memory capacity | Can only use partial memory capacity
Maintain plasticity in an unexpected world | Externally shut off plasticity to prevent capacity catastrophe
Learn internal top-down expectations | Externally impose costs
Active attentional focus regulates learning | Passive learning
Slow or fast learning | Slow learning or oscillation catastrophe
Learn in approximate-match phase | Learn in mismatch phase
Use self-regulating hypothesis testing to globally reorganize the energy landscape | Use noise to perturb system out of local minima in a fixed energy landscape
Fast adaptive search for best match | Search tree
Rapid direct access to codes of familiar events | Recognition time increases with code complexity
Variable error criterion (vigilance parameter) sets coarseness of recognition code in response to environmental feedback | Fixed error criterion in response to environmental feedback
All properties scale to arbitrarily large system capacities | Key properties deteriorate as system capacity increased

schemes must learn slowly, and in a stationary
environment, or risk unstable
oscillations in response to the mismatches.
They can also be destabilized if the external
teaching signal is noisy, because such
noise creates spurious mismatches.

These learning models also tend to get
trapped in local minima, or globally incorrect
solutions. Models such as simulated
annealing and the Boltzmann machine 10
use internal system noise to escape local
minima and approach a more global minimum.
An externally controlled (temperature)
parameter regulates this process by
making it converge ever more slowly to a
critical value.

In contrast, approximate matches,
rather than mismatches, drive the learning
process in ART. Learning in the
approximate-match mode enables rapid
and stable learning to occur while buffering
the system’s memory against external
noise. The hypothesis testing cycle replaces
internal system noise as a scheme for discovering
a globally correct solution. It
does not use an externally controlled temperature
parameter or teacher.

Matching by the 2/3 
rule 

One of the key constraints on the design
of the ART 1 architecture is its rule for
matching a bottom-up input pattern with
a top-down expectation at F1. This rule,
called the 2/3 Rule, 1 is necessary to regulate
both the hypothesis testing cycle and
the self-stabilization of learning in an ART
1 system.

The 2/3 Rule reconciles two properties
whose simplicity tends to conceal their
fundamental nature: In response to an
arbitrary bottom-up input pattern, F1
nodes can be supraliminally activated; that
is, activated enough to generate output signals
to other parts of the network and
thereby to initiate the hypothesis testing
cycle. In response to an arbitrary top-down
expectation, however, F1 nodes are
only subliminally activated; they sensitize,
prepare, or attentionally prime F1 for
future input patterns that may or may not
generate an approximate match with this
expectation, but do not, in themselves,
generate output signals. Such a subliminal
reaction enables an ART system to anticipate
future events and thus to function as
an “intentional” machine. In particular,
if an attentional prime is locked into place
by a high-gain top-down signal source,
then an ART system can automatically
suppress all inputs that do not fall into a
sought-after recognition category, yet
amplify and hasten the processing of all
inputs that do. 3

To implement the 2/3 Rule, we need to
assure that F1 can distinguish between
bottom-up and top-down signals, so that
it can supraliminally react to the former
and subliminally react to the latter. In
ART 1, this distinction is determined by a
third F1 input source, called an attentional
gain control channel, that responds
differently to bottom-up and top-down
signals.

Figure 2 describes how this gain control
source works. When activated, it excites
each F1 node equally. The 2/3 Rule says
that at least two out of three input sources
are needed to supraliminally activate an
F1 node; the three are a bottom-up input,
a top-down input, and a gain control
input. In the top-down processing mode
(see Figure 2a), each F1 node receives a
signal from at most one input source and,
hence, is only subliminally activated. In
the bottom-up processing mode (Figure
2b), each active bottom-up pathway can
turn on the gain control node, whose output,
once on, is independent of the total
number of active bottom-up pathways.
Then, all F1 nodes receive at least a gain
control input, but only those nodes that
also receive a bottom-up input are
supraliminally activated.

When both bottom-up and top-down
inputs reach F1 (see Figure 2c), the gain
control source is shut off, so that only
those F1 nodes which receive top-down
Figure 3. ART 1 system: Two successive stages, F1 and F2, of the attentional subsystem
encode patterns of activation in short-term memory (STM). Bottom-up and
top-down pathways between F1 and F2 contain adaptive long-term memory (LTM)
traces which multiply the signals in these pathways. The remainder of the circuit
modulates these STM and LTM processes. Modulation by gain control enables F1
to distinguish between bottom-up input patterns and top-down priming, or expectation,
patterns, and to match these bottom-up and top-down patterns by the 2/3
Rule. Gain control signals also enable F2 to react supraliminally to signals from F1
while an input pattern is on. The orienting subsystem generates a reset wave to F2
when sufficiently large mismatches between bottom-up and top-down patterns
occur at F1. This reset wave selectively and enduringly inhibits previously active F2
cells until the input is shut off.


[Figure 4 image: panels (a) and (b) showing top-down templates.]


Figure 4. Alphabet learning: Code learning by ART 1 in response to the first
presentation of the first 20 letters of the alphabet is shown. Two different vigilance
levels were used, ρ = .5 and ρ = .8. Each row represents the total code learned
after the letter at the left-hand column of the row is presented at F1. Each column
represents the learning, through time, of the top-down LTM vector, or expectation,
corresponding to the F2 node whose index is listed at the top of the column. These
LTM vectors do not, in general, equal the input patterns which change them
through learning. Instead, each expectation acts like a novel type of prototype for
the entire set of practiced input patterns coded by that node, as well as for
unfamiliar input patterns that share invariant properties with this set. The simulation
illustrates the “fast learning” case, in which the altered LTM traces reach a
new equilibrium in response to each new stimulus. Slow learning is more gradual
than this.


confirmation of the bottom-up input are
supraliminally activated. In this case, the
2/3 Rule maintains supraliminal activity
only within the spatial intersection of the
bottom-up input pattern and the top-down
expectation. Consequently, if a bottom-up
input pattern, as in Figure 2b, causes the
read-out of a badly matched top-down
expectation, as in Figure 2c, then the total
number of supraliminally active F1 nodes
can suddenly decrease, thereby causing a
decrease in the total output signal emitted
by F1. This property is used heavily in
controlling the hypothesis testing and self-stabilization
processes, as we will show
next.
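
The three processing modes of the 2/3 Rule can be sketched in a few lines of Python. This is a minimal illustration for binary patterns only, not the differential-equation network of ART 1; the function names `gain_control` and `two_thirds_rule` are hypothetical, and each input source is assumed to contribute exactly one unit of excitation.

```python
def gain_control(bottom_up, top_down):
    """Gain control is on only when some bottom-up pathway is active and
    no top-down expectation is being read out (Figure 2b)."""
    return any(bottom_up) and not any(top_down)


def two_thirds_rule(bottom_up, top_down):
    """Return the supraliminal F1 activity for two equal-length binary lists.

    Assumption: a node fires when at least two of its three input sources
    (bottom-up, top-down, gain control) are on.
    """
    gain = 1 if gain_control(bottom_up, top_down) else 0
    return [1 if (b + t + gain) >= 2 else 0
            for b, t in zip(bottom_up, top_down)]


if __name__ == "__main__":
    I = [1, 1, 0, 1, 0]       # bottom-up input pattern
    V = [1, 0, 0, 1, 1]       # top-down expectation
    silent = [0, 0, 0, 0, 0]  # no signal on that pathway
    print(two_thirds_rule(silent, V))  # top-down only: subliminal
    print(two_thirds_rule(I, silent))  # bottom-up only: input is registered
    print(two_thirds_rule(I, V))       # both: only the intersection survives
```

In the third case the number of active F1 nodes drops from three to two, which is the decrease in total F1 output that the text describes.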


Automatic control of 
hypothesis testing 

An ART architecture automates its
hypothesis testing cycle through interactions
between an attentional subsystem
and an orienting subsystem. These subsystems
in the ART 1 architecture are
schematized in Figure 3.

The orienting subsystem A generates an
output signal only when a mismatch
occurs between a bottom-up input pattern
and top-down expectation at level F1 of
the attentional subsystem. Thus, A functions
like a novelty detector. The output
signal from A is called an STM reset wave
because it selectively inhibits the active
node(s) at level F2 of the attentional subsystem.
The novelty detector A thereby
disconfirms the F2 hypothesis that led to
the F1 mismatch.

The 2/3 Rule controls the reset wave
emitted by A as follows: When a bottom-up
input pattern is presented, each of the
active input pathways to F1 also sends a
signal to the orienting subsystem A, where
all of these signals are added up. When the
input pattern activates F1, each of the
activated F1 nodes sends an inhibitory signal
to A. The system is designed so that the
total inhibitory signal is larger than the
total excitatory signal. Thus, in the
bottom-up mode, the balance between
active F1 nodes and active input lines prevents
a reset wave from being triggered.
(Note that level F1 in ART 1 is not normalized
as it was in Equation 1 of the competitive
learning model and in the ART 2
systems discussed below. The decision of
whether and how to normalize depends
upon the design of the whole system.)

This balance is upset when a top-down
expectation is read out that mismatches the
bottom-up input pattern at F1. As in Figure
2c, the total output from F1 then
decreases by an amount that grows with
the severity of the mismatch. If the attenuation
is sufficiently great, then inhibition
from F1 to A can no longer prevent A
from emitting a reset wave. A parameter
ρ, called the vigilance parameter, determines
how large a mismatch will be tolerated
before A emits a reset wave. High vigilance
forces the system to search for new categories
in response to small differences
between input and expectation. Then, the
system learns to classify input patterns into
a large number of fine categories. Low vigilance
enables the system to tolerate large
mismatches and thus group together input
patterns according to a coarse measure of
mutual similarity. The vigilance parameter
may be placed under external control,
being increased, for example, when the
network is “punished” for failing to distinguish
two inputs that give rise to different
consequences. 1,3
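
The balance between excitation and inhibition at A can be written as a single comparison. The sketch below assumes the ratio form of the test that is standard in descriptions of ART 1 (reset when the number of active F1 nodes falls below the vigilance times the number of active input lines); the text above gives only the qualitative rule, so the exact scaling and the name `reset_wave` are assumptions.

```python
def reset_wave(input_pattern, f1_activity, vigilance):
    """Return True if the orienting subsystem A would emit a reset wave.

    Assumed model: excitation to A is the vigilance times the number of
    active input lines; inhibition is the number of active F1 nodes.
    A fires when excitation exceeds inhibition.
    """
    excitation = vigilance * sum(input_pattern)
    inhibition = sum(f1_activity)
    return excitation > inhibition


if __name__ == "__main__":
    I = [1, 1, 0, 1, 0]        # three active input lines
    X = I                      # bottom-up mode: F1 mirrors the input
    X_star = [1, 0, 0, 1, 0]   # after a mismatched expectation: two nodes left
    print(reset_wave(I, X, 0.8))        # no reset in bottom-up mode
    print(reset_wave(I, X_star, 0.5))   # low vigilance tolerates the mismatch
    print(reset_wave(I, X_star, 0.8))   # high vigilance triggers a reset
```

With a vigilance of at most 1, the bottom-up mode can never trigger a reset under this model, which matches the design constraint stated in the text.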

Figure 4 schematizes learning in
response to the first 20 input presentations
of a computer simulation of alphabet
learning. After presenting the 20th input,
nine recognition categories have formed
when ρ = .8, but only four categories have
been formed when ρ = .5. In this computer
experiment, learning self-stabilized
after at most three presentations of the 26
letters at any level of vigilance, and the
learned LTM codes were more abstract,
that is, less letter-like, at lower levels of
vigilance.

Figure 5 illustrates how these properties
of the interaction between levels F1, F2,
and A regulate the hypothesis testing cycle
of the ART 1 system. In Figure 5a, an
input pattern I generates an STM activity
pattern X across F1. The input pattern I
also excites the orienting subsystem A, but
pattern X at F1 inhibits A before it can
generate an output signal. Activity pattern
X also elicits an output pattern S which
activates the bottom-up adaptive filter T
= ZS, where Z is the matrix of bottom-up
LTM traces. As a result, an STM pattern
Y becomes active at F2. In Figure 5b, pattern
Y generates a top-down output U
through the adaptive filter V = ZU, where
Z is the matrix of top-down LTM traces.
Vector V is the top-down expectation read
into F1. Expectation V mismatches input
I, significantly inhibiting STM activity
across F1. The amount by which activity
in X is attenuated to generate the activity
pattern X* depends upon how much of the
input pattern I is encoded within the expectation
V, via the 2/3 Rule.
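
The two adaptive filters in Figures 5a and 5b amount to a pair of matrix-vector products. The sketch below is illustrative only: the text writes Z for both LTM matrices, so the names `z_bottom_up` and `z_top_down` are hypothetical labels introduced here to keep them apart, the weight values are made up, and F2 is assumed to make a winner-take-all choice (the special case the article mentions later).

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def choose(inputs):
    """Winner-take-all F2: a one-hot pattern at the largest input."""
    winner = max(range(len(inputs)), key=lambda j: inputs[j])
    return [1 if j == winner else 0 for j in range(len(inputs))]


# Illustrative LTM traces for four F1 nodes and two F2 nodes.
z_bottom_up = [[0.4, 0.4, 0.0, 0.0],   # traces into F2 node 0
               [0.0, 0.0, 0.3, 0.3]]   # traces into F2 node 1
z_top_down = [[1, 0],                  # traces into F1 node 0
              [1, 0],
              [0, 1],
              [0, 1]]

S = [1, 1, 0, 1]                # output pattern of F1 for input I
T = mat_vec(z_bottom_up, S)     # bottom-up filter: T = ZS
U = choose(T)                   # F2 activity Y and its output U
V = mat_vec(z_top_down, U)      # top-down filter: V = ZU, the expectation
X_star = [s & v for s, v in zip(S, V)]  # what survives at F1 under the 2/3 Rule
```

Here the expectation V covers only two of the three active input features, so X* has fewer active nodes than X, which is the attenuation the paragraph describes.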

When a mismatch attenuates STM
activity across F1, the total size of the
inhibitory signal from F1 to A is also
attenuated. If the attenuation is sufficiently
great, inhibition from F1 to A can
no longer prevent the arousal source A
from firing. Figure 5c depicts how disinhibition
of A releases an arousal burst to
F2 which equally, or nonspecifically,
excites all the F2 cells. The cell populations
of F2 react to such an arousal signal
in a state-dependent fashion. In the special
case that F2 chooses a single population
for STM storage, the arousal burst selectively
inhibits, or resets, the active population
in F2. This inhibition is
long-lasting.

In Figure 5c, inhibition of Y leads to
removal of the top-down expectation V,
and thereby terminates the mismatch
between I and V. Input pattern I can thus
reinstate the original activity pattern X
across F1, which again generates the output
pattern S from F1 and the input pattern
T to F2. Due to the enduring
inhibition at F2, the input pattern T can
no longer activate the original pattern Y at
F2. Level F2 has been conditioned by the
disconfirmation of the original hypothesis.
A new pattern Y* is thus generated at
F2 by I (see Figure 5d).

The new activity pattern Y* reads out a
new top-down expectation V*. If a mismatch
again occurs at F1, the orienting
subsystem is again engaged, leading to
another arousal-mediated reset of STM at
F2. In this way, a rapid series of STM
matching and reset events may occur. Such
an STM matching and reset series controls


Figure 5. ART 1 hypothesis testing cycle: In (a), the input pattern I generates the
STM activity pattern X at F1 as it activates A. Pattern X both inhibits A and generates
the bottom-up signal pattern S. Signal pattern S is transformed via the adaptive
filter into the input pattern T = ZS, which activates the compressed STM
pattern Y across F2. In (b), pattern Y generates the top-down signal pattern U
which is transformed by the top-down adaptive filter V = ZU into the expectation
pattern V. If V mismatches I at F1, then a new STM activity pattern X* is generated
at F1. The reduction in total STM activity that occurs when X is transformed into
X* causes a decrease in the total inhibition from F1 to A. In (c), then, the input-driven
activation of A can release a nonspecific arousal wave to F2, which resets the
STM pattern Y at F2. In (d), after Y is inhibited, its top-down expectation is eliminated,
and X can be reinstated at F1. Now X once again generates input pattern T
to F2, but since Y remains inhibited T can activate a different STM pattern Y* at
F2. If the top-down expectation due to Y* also mismatches I at F1, then the rapid
search for an appropriate F2 code continues.
Figure 6. Category grouping by ART 2 of 50 analog input patterns into 34 recognition
categories. Each input pattern I is depicted by a graph as a function of abscissa
values i, with successive ordinate I_i values connected by straight lines.
The category structure established upon one complete presentation of the 50 inputs
thereafter remains stable if the same inputs are presented again.



Figure 7. Lower vigilance implies coarser grouping. The same ART 2 system as
used in Figure 6 has here grouped the same 50 inputs into 20 recognition categories.
Note, for example, that Categories 1 and 2 of Figure 6 are joined in Category 1;
Categories 14, 15, and 32 are joined in Category 10; and Categories 19-22 are
joined in Category 13.

the system’s hypothesis testing and search 
of LTM by sequentially engaging the 
novelty-sensitive orienting subsystem. 

Although STM is reset sequentially in
time by this mismatch-mediated, self-terminating
LTM search process, the
mechanisms that control the LTM search
are all parallel network interactions, rather
than serial algorithms. Such a parallel
search scheme continuously adjusts itself
to the system’s evolving LTM codes. The
LTM code depends on both the system’s
initial configuration and its unique learning
history, and hence cannot be predicted
by a prewired search algorithm. Instead,
the mismatch-mediated engagement of the
orienting subsystem triggers a process of
parallel self-adjusting search that tests
only the hypotheses most likely to succeed,
given the system’s unique learning history.

The mismatch-mediated search of LTM
ends when an STM pattern across F2
reads out a top-down expectation that
approximately matches I (to the degree of
accuracy required by the level of attentional
vigilance) or that has not yet undergone
any prior learning. In the former
case, the accessed recognition code is
refined based on any novel information
contained in the input I; that is, based
upon the activity pattern resonating at F1
that fuses together bottom-up and top-down
information according to the 2/3
Rule. In the latter case, a new recognition
category is established as a new bottom-up
code and top-down template are learned.
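
The whole search cycle can be summarized as a short sequential sketch for binary patterns in the fast-learning case. It is a simplification under stated assumptions, not the parallel network itself: the loop over candidates stands in for the series of reset events, the ordering of candidates uses the choice function commonly given for ART 1 (overlap divided by a small constant plus template size, with the constant `beta` chosen arbitrarily here), and the names `present` and `learn` are hypothetical.

```python
def overlap(a, b):
    """Componentwise AND of two binary lists (the 2/3 Rule intersection)."""
    return [x & y for x, y in zip(a, b)]


def present(input_pattern, templates, vigilance, beta=0.5):
    """Present one binary pattern and return the index of its category.

    `templates` holds one top-down template per committed F2 node and is
    updated in place. `beta` is an assumed tie-breaking constant.
    """
    size = sum(input_pattern)
    if size == 0:
        raise ValueError("input pattern must have at least one active line")
    # Candidates are tried in order of bottom-up match strength.
    order = sorted(
        range(len(templates)),
        key=lambda j: sum(overlap(input_pattern, templates[j]))
        / (beta + sum(templates[j])),
        reverse=True,
    )
    for j in order:
        match = overlap(input_pattern, templates[j])
        if sum(match) >= vigilance * size:
            templates[j] = match      # approximate match: refine the code
            return j
        # Otherwise node j is reset and the search moves on.
    templates.append(list(input_pattern))  # uncommitted node: new category
    return len(templates) - 1


def learn(patterns, vigilance):
    """Present each pattern once; return category labels and templates."""
    templates = []
    labels = [present(p, templates, vigilance) for p in patterns]
    return labels, templates


if __name__ == "__main__":
    patterns = [
        [1, 1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 1, 1],
        [0, 0, 1, 1, 1, 1],
    ]
    for rho in (0.5, 0.9):
        labels, templates = learn(patterns, rho)
        print(rho, labels, len(templates))
```

On these four made-up patterns, the higher vigilance setting separates the last pattern into its own category, so the same inputs yield three categories instead of two.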


ART 2: Learning to 
recognize an analog 
world 

Although self-organized recognition of
binary patterns is useful in many applications,
such as recognition of printed or
written text, as in Figure 4, many other
applications require the ability to categorize
arbitrary sequences of analog (including
binary) input patterns. A class of
architectures, generically called ART 2,
has been developed for this purpose. 2

Given the enhanced capabilities of ART
2 architectures, a sequence of arbitrary
input patterns can be fed through an arbitrary
preprocessor before the output patterns
of the preprocessor are fed as inputs
into an ART 2 system for automatic classification.
Figure 6 illustrates how an ART
2 architecture has quickly learned to stably
classify 50 analog input patterns, chosen
to challenge the architecture in
multiple ways, into 34 recognition categories
after a single learning trial. Figure 7
illustrates how the same 50 input patterns
have been quickly classified into 20 coarser
categories after a single learning trial,
using a smaller setting of the vigilance
parameter.

ART 2 architectures can autonomously
classify arbitrary sequences of analog
input patterns into categories of arbitrary
coarseness while suppressing arbitrary
levels of noise. They accomplish this by
modifying the ART 1 architecture to
incorporate solutions of several additional
design problems into their circuitry. In
particular, level F1 is split into separate
sublevels for receiving bottom-up input
patterns, for receiving top-down expectations,
and for matching the bottom-up and
top-down data, as in Figure 8.

Three versions of the ART 2 architecture
are now being applied to problems
such as visual pattern recognition, speech
perception, and radar classification. In
addition to research on ART undertaken
at several universities, applications are also
being developed at government laboratories
and industrial firms including the MIT
Lincoln Laboratory; Booz-Allen and
Hamilton, Inc.; Hecht-Nielsen Neurocomputer
Corp.; Science Applications
International Corp.; the U.S. Army
Research Center at Redstone Arsenal; and
Wright-Patterson Air Force Base.


Invariant visual pattern 
recognition 


Researchers from Boston University
and the MIT Lincoln Laboratory are collaborating
to carry out an application to
invariant visual pattern recognition. This
application uses a three-stage preprocessor,
summarized in Figure 9.

First, the image figure to be recognized
is detached from the image background
using laser radar sensors. This can be
accomplished by intersecting the images
formed by two laser sensors: the image
formed by a range detector focused at the
distance of the figure and the image
formed by another laser detector capable
of differentiating figure from background,
such as a Doppler image when the
figure is moving or the intensity of laser
return when the figure is stationary. 13

The second stage of the preprocessor
contains a neural network, called a boundary
contour system, 3,4 that detects, sharpens,
regularizes, and completes the
boundaries within noisy images.

The third stage of the preprocessor contains
a Fourier-Mellin filter, whose output
spectra are invariant under such image
transformations as 2D spatial translation,
dilation, and rotation. 14
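
One ingredient of this invariance is easy to check numerically: the magnitude of a discrete Fourier transform does not change when the signal is circularly shifted. The sketch below demonstrates only that translation property, on a 1D signal rather than a 2D image, and leaves out the log-polar (Mellin) step that would be needed for dilation and rotation; it is not the filter used in the application, and the function names are hypothetical.

```python
import cmath


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def circular_shift(signal, steps):
    """Shift a list to the right by `steps` positions, wrapping around."""
    steps %= len(signal)
    return signal[-steps:] + signal[:-steps]


if __name__ == "__main__":
    profile = [0, 1, 3, 2, 0, 0, 0, 0]
    shifted = circular_shift(profile, 3)
    original_spectrum = dft_magnitudes(profile)
    shifted_spectrum = dft_magnitudes(shifted)
    # The two spectra agree up to floating-point rounding.
    print(max(abs(a - b)
              for a, b in zip(original_spectrum, shifted_spectrum)))
```

A shift multiplies each Fourier coefficient by a unit-magnitude phase factor, so taking magnitudes discards exactly the information about position.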

Thus, the input patterns to ART 2 are
the invariant spectra of completed boundary
segmentations of laser radar sensors.
By setting ART 2 parameters to suppress
(up to) a prescribed level of input noise and
to tolerate (up to) a prescribed level of
input deformation, this system defines a
compact circuit capable of autonomously
learning to recognize visual targets that are
deformed, rotated, dilated, and shifted.
Although this preprocessor does not purport
to provide a biological solution to the
problem of invariant visual object recognition,
we know that the mammalian visual
cortex does carry out computations
analogous to aspects of the second and
third stages of this preprocessor. 3,4,15



Figure 8. A typical ART 2 architecture. Open arrows indicate specific patterned
inputs to target nodes. Filled arrows indicate nonspecific gain control inputs. The
gain control nuclei (large filled circles) nonspecifically inhibit target nodes in
proportion to the L2-norm of STM activity in their source fields. As in ART 1, gain
control (not shown) coordinates STM processing with input presentation rate.


The three R’s: 
Recognition, 
reinforcement, and 
recall 


Recognition is only one of several 
processes whereby an intelligent system 
can learn a correct solution to a problem. 
Reinforcement and recall are no less 
important in designing an autonomous 
intelligent system. 

Reinforcement, notably reward and 
punishment, provides additional information
in the form of environmental
feedback based on the success or failure of 
actions triggered by a recognition event. 
Reward and punishment calibrate whether 
the action has or has not satisfied internal 
needs, which in the biological case include 
hunger, thirst, sex, and pain reduction, but 
may in machine applications include a 
wide variety of internal cost functions. 
Reinforcement can modify the formation 
of recognition codes and can shift 
attention to focus upon those codes whose 
activation promises to satisfy internal 
needs based on past experience. For 
example, both green and yellow bananas 
may be recognized as part of a single 
recognition category until reinforcement 
signals, contingent upon eating the 



Figure 9. A three-stage preprocessor for 
the ART 2 system enables input patterns 
that are deformed, shifted, dilated, and 
rotated to be recognized as exemplars of 
the same category. The preprocessor 
passes laser radar images that separate 
figure from background through a 
boundary segmentation network and 
then through a Fourier-Mellin transform.
The Fourier-Mellin spectra are
the inputs to ART 2. 
Figure 10. A self-organizing architecture for invariant pattern recognition and
recall that can be expanded, as noted in the text, to include reinforcement mechanisms
capable of focusing attention upon internally desired classes of external
events.


bananas, differentiate them into separate 
categories. 

Recall can generate equivalent responses 
or actions to input events classified by 
different recognition codes. For example, 
printed and script letters might generate 
distinct recognition codes, yet can also 
elicit identical learned naming responses. 

Our own research program during the
past two decades at Boston University has
been devoted to discovering and implementing
models of self-organizing
biological systems wherein all the
ingredients of recognition, reinforcement,
and recall join together in a single integrated
circuit. 3,4 The system depicted in
Figure 10 provides a framework for implementing
some of these circuit designs. In
particular, as ART 2 self-organizes
recognition categories in response to the
preprocessed inputs, its categorical choices
at the F2 classifying level self-stabilize
through time. In examples wherein F2
makes a choice, ART 2 can be used as the
first level of an ART 1 architecture, or yet
another ART 2 architecture. Let us call the
classifying level of this latter architecture
F3. Level F3 can be used as a source of
pre-wired priming inputs to F2.

Alternatively, as in Figure 10, self-stabilizing
choices by F3 can quickly be
learned in response to the choices made at
F2. Then, F3 can be used as a source of
self-organized priming inputs to F2, and a
source of priming patterns can be
associated with each of the F3 choices via
mechanisms of associative pattern learning. 3
After learning of these primes, turning
on a particular prime can activate a
learned F3 → F2 top-down expectation.
Then F2 can be supraliminally activated
only by an input exemplar which is a member
of the recognition category of the
primed F2 node.

The architecture ignores all but the
primed set of input patterns. In other
words, the prime causes the architecture to
pay attention only to expected sources of
input information. Due to the spatial
invariance properties of the preprocessor,
the expected input patterns can be translated,
dilated, or rotated in 2D without
damaging recognition. Due to the similarity
grouping properties of ART 2 at a fixed
level of vigilance, suitable deformations of
these input patterns, including deformations
due to no more than anticipated
levels of noise, can also be recognized.

The output pathways from level F2 of
ART 2 to the postprocessor can learn to
recall any spatial pattern or spatiotemporal
pattern of outputs by applying theorems
about associative learning in a type
of circuit called an avalanche. In particular,
distinct recognition categories can
learn to generate identical recall responses.
Thus, the architecture as a whole can stably
self-organize an invariant recognition
code and an associative map to an arbitrary
format of output patterns.

The interactions (priming → ART) and
(ART → postprocessor) in Figure 10 can
be modified so that output patterns are
read out only if the input patterns have
yielded rewards in the past and if the
machine’s internal needs for these rewards
have not yet been satisfied. 3,4 In this variation
of the architecture, the priming patterns
supply motivational signals for
releasing outputs only if an input exemplar
from an internally desired recognition category
is detected. The total circuit forms
a neural network architecture which can

• stably self-organize an invariant pattern
recognition code in response to a
sequence of analog or binary input
patterns

• be attentionally primed to ignore all
but a designated category of input patterns

• automatically shift its prime as it satisfies
internal criteria in response to actions
based upon the recognition of a previously
primed category of input patterns and

• learn to generate an arbitrary
spatiotemporal output pattern in response
to any input pattern exemplar of an activated
recognition category.

Such circuits, and their real-time adaptive
autonomous descendants, may prove
useful in some of the many applications
where preprogrammed rule-based systems,
and systems requiring external
teachers not naturally found in the application
environments, fear to tread.


Self-stabilization of 
speech perception and 
production 

The insights gleaned from the design of
ART 2 have also begun to clarify how we
can design hierarchical learning systems
with multiple ART levels. Figure 11 shows
a hierarchical ART system for learning to
recognize and produce speech. The system
self-stabilizes its learning in real time without
using a teacher. This ART architecture
is being developed at Boston University by
Michael Cohen, Stephen Grossberg, and
David Stork. Top-down ART expectation
mechanisms at several levels of the architecture
help to self-stabilize learned codes
and to self-organize the selection of invariant
recognition properties. Of particular
interest in this speech architecture is the
role of top-down expectation signals from
the architecture’s articulatory, or motor,
system to its auditory, or perception, system.
These expectations help to explain
classical results from motor theory, which
state that speech is perceived in terms of
how it would have been produced, even
during passive listening.

The key insights of the motor theory
take on new meaning through the self-stabilizing
properties of top-down
articulatory-to-auditory expectations.
These expectations self-stabilize the
learned imitative associative map that
transforms the perceptual codes which
represent heard speech into motor codes
for generating spoken speech. In so doing,
the articulatory-to-auditory expectations
deform the bottom-up auditory STM patterns
via 2/3 Rule-like matching into activation
patterns consistent with invariant
properties of the motor commands. These
motorically modified STM codes are then
encoded in long-term memory in a
bottom-up adaptive filter within the auditory
system itself. This bottom-up adaptive
filter activates a partially compressed
speech code at the auditory system’s next
processing level. The motorically modified
speech code is thus activated during passive
listening as well as during active imitation.

Psychophysiological 
and neurophysiological 
predictions of ART 

Although applications of ART to computer
science depend upon the computational
power of these systems for solving
real-world problems, ART systems are
also models of the biological processes
whose analysis led to their discovery. In
fact, in addition to suggesting mechanistic
explanations of many interdisciplinary
data about the mind and brain, the theory
has also made a number of predictions
[Figure 11 image: Perception (left) and Motor (right) columns.]



Figure 11. Schematic of some processing stages in an architecture for a self-organizing
speech perception and production system. The left-hand side of the figure
depicts five stages of the auditory model; the right-hand side depicts four stages
of the motor model. The pathways from the partially compressed auditory code to
the motor system learn an imitative associative map which joins auditory feedback
patterns to the motor commands that generated them. These motor commands are
compressed via bottom-up and top-down adaptive filters within the motor system
into motor synergies. The synergies read out top-down learned articulatory-to-auditory
expectations, which select the motorically consistent auditory data for
incorporation into the learned speech codes of the auditory system.


partially supported by experiments. For 
example, in 1976, it was predicted that 
both norepinephrine (NE) mechanisms 
and attentional mechanisms modulate the 
adaptive development of thalamocortical 
visual feature detectors. In 1976 and 1978, 
Kasamatsu and Pettigrew described NE 


modulation of feature detector develop¬ 
ment, and Wolf Singer reported atten¬ 
tional modulation in 1982. In 1978, a word 
length effect in word recognition 
paradigms was predicted. In 1982 and 
1983, Samuel, van Santen, and Johnston 
reported a word length effect in word 
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superiority experiments. In 1978 and 1980, 
a hippocampal generator of the P300 
event-related potential was predicted. In 
1980, Halgren and his colleagues reported 
the existence of a hippocampal P300 
generator in humans. The existence and 
correlations between other event-related 
potentials, such as processing negativity 
(PN), early positive wave (P120), and
N200 were also predicted in these theoret¬ 
ical articles. These predictions and suppor¬ 
tive data are described in several recent 
books. 3,4 

Thus, ART systems provide a fertile ground for gaining a new
understanding of biological intelligence. They also suggest novel
computational theories and real-time adaptive
neural network architectures with promis¬ 
ing properties for tackling some of the out¬ 
standing problems in computer science 
and technology today. □

Computing with Structured Neural Networks

Jerome A. Feldman, Mark A. Fanty, and Nigel H. Goddard 
University of Rochester 


Rapid advances in the neurosciences and in computer science
are leading to renewed interest in 
computational models linking animal 
brains and behavior. The idea of looking 
directly at massively parallel realizations 
of intelligent activity promises to be fruit¬ 
ful for the study of both natural and arti¬ 
ficial computation. Much attention has 
been directed towards the biological impli¬ 
cations of this interdisciplinary effort, but 
there are equally important relations with 
computational theory, hardware, and 
software. 

Recent work on neural network compu¬ 
tation, much of it carried out by physi¬ 
cists, 1 examines the emergent behavior of 
large, unstructured collections of comput¬ 
ing units. We are more concerned with 
how one can design, realize, and analyze 
networks that embody the specific com¬ 
putational structures needed to solve hard 
problems. In this article, we focus on the 
design and use of massively parallel com¬ 
putational models, particularly in artificial 
intelligence. We also describe a computing 
environment for working with structured 
networks and present some sample appli¬ 
cations. Throughout, we treat adaptation 
and learning as ways to improve structured 
networks, not as replacements for analy¬ 
sis and design. 


Designing and 
debugging massively 
parallel, connectionist 
systems will require 
appropriate tools. One 
such package is now 
in use at Rochester 
and other laboratories. 


Opportunities and 
limitations 

Computationally, animal brains are 
remarkable machines with properties rad¬ 
ically different from those of conventional 
computers. Our research starts from the 
assumption that an abstract computer 
based on the computational properties of 
animal brains may prove particularly well-suited
for problems in vision, locomotion,
and language understanding. By taking
seriously the computational constraints 
faced by nature and the structure of tasks 
known from artificial intelligence and 
other disciplines, we hope to discover 
algorithms employed by animals that 
might be effective for machines. 

Even a crude analysis of neural compu¬ 
tation reveals several major constraints. 
When asked to carry out any of a wide 
range of tasks, such as naming a picture or 
deciding if some sound is an English noun, 
people can respond correctly in about a 
half-second. The human brain, a device 
composed of neural elements having a 
basic computing speed of a few millise¬ 
conds, can solve such problems of vision 
and language in a few hundred millise¬ 
conds, that is, in about a hundred time 
steps. The best AI programs for these tasks 
are not nearly as general and require mil¬ 
lions of computational time steps. This 
hundred-step rule is a major constraint on
any computational model of behavior. 
The same timing considerations show that 
the amount of information sent from one 
neuron to another is very small, a few bits 
at most. The range of spike frequencies is 
limited and the system too noisy for deli¬ 
cate phase encodings to be functional. This 
means that complex structures are not
transmitted directly and, if present, must
be encoded in some way. Since the critical
information must be captured in the connections,
this is called a “connectionist” model.

Figure 1. Two approaches to artificial intelligence. [Diagram label: Structured Connectionist Models.]

The number of neurons in the human 
brain, estimated at about 10^11, presents
serious constraints. For one thing, the
number of cells in each retinal ganglion is
greater than 10^6, which automatically
rules out any vision algorithm requiring
N^2 units since there could not be a separate
unit for each possible line joining two
points on the retina. Many attractive 
algorithms for higher level tasks run afoul 
of the size constraint, and considerable 
work has gone into what amounts to cod¬ 
ing tricks to circumvent this constraint. 2,3 
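
The arithmetic behind this size argument is easy to check. The sketch below uses only the estimates quoted in the text (about 10^11 neurons in the brain and more than 10^6 retinal cells); the variable names are illustrative, not taken from any cited work.

```python
# Back-of-the-envelope check of the size constraint, using the
# estimates quoted in the text. All names here are illustrative.
NEURONS_IN_BRAIN = 10**11   # estimated number of neurons in the human brain
RETINAL_UNITS = 10**6       # lower bound on N quoted in the text

# One unit for every line joining two retinal points needs about N^2 units.
units_needed = RETINAL_UNITS**2

print(units_needed)                      # 1000000000000
print(units_needed // NEURONS_IN_BRAIN)  # 10
```

Even at the lower bound for N, an N^2 scheme would need ten times as many units as the whole brain is estimated to contain.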

One might think that biological con¬ 
straints need not be taken seriously in 
designing neural-net hardware and soft¬ 
ware. Certainly, electronic switching times 
are a million times faster than neural ones.
But the size constraint and particularly the 
connectivity constraint are much more 
serious for artificial systems. Animal 
brains derive much of their power from 
their very large fan-in and fan-out: each 
unit can be connected to thousands of 


others. For conventional chips, a fan-out 
of six is quite unusual. In software simu¬ 
lations like the ones described in this arti¬ 
cle, excessive connectivity leads to serious 
performance problems. The massively 
parallel computational solutions evolved 
by nature will, no doubt, prove useful to 
computer science and engineering, but not 
without modification. Connectionist com¬ 
puter research aims to understand the prin¬ 
ciples of massively parallel computation 
and to apply them to appropriate tasks. 

Connectionist models can be viewed as 
synthesizing two traditionally opposed 
approaches to artificial intelligence. 4 (See 
Figure 1.) Some early AI investigators 
focused on the parallelism, robustness, 
and plasticity of animal brains and 
explored ways of automatically generating 
high performance. The other group con¬ 
centrated on the detailed structure of tasks 
and algorithms and expressed them in con¬ 
ventional computer notation. Structured 
connectionist models attempt to capture 
the best of both paradigms. For example, 
the network fragment of Figure 2 encodes 
a conventional semantic network and 
inference structure in a parallel, evidential 
realization. Efforts like this one by 


Shastri 5 involve all the analysis and design 
issues of both AI and network theory. 
Recent developments in hardware and in 
connectionist learning theories 6 have led 
some people to believe that all this might 
not be necessary and that uniform learn¬ 
ing mechanisms will suffice for all the 
problems of interest. This is a seductive 
idea, but there are very good reasons— 
some of which are outlined below—for 
believing that it cannot work. 

Realistic expectations. The current 
explosion of interest in neural networks 
(connectionist models, etc.) is based on a 
number of scientific and economic expec¬ 
tations, some of which are unreasonable. 
We can be quite sure that neural networks 
will not replace conventional computers, 
eliminate programming, or unravel the 
mysteries of the mind. We can expect bet¬ 
ter understanding of massively parallel 
computation to have an important role in 
practical tasks and in the behavioral and 
brain sciences, but only through interac¬ 
tion with other approaches to these prob¬ 
lems. As always, specific structures of 
problems, disciplines, and computational 
systems are the cornerstone of success. The 
main hope of massively parallel (neural 
network) research is that it will provide a 
better basis for such efforts. 

One particularly simplistic view of neu¬ 
ral networks is that they will support intel¬ 
ligence by implementing a holographic- 
style memory. The basic problems with 
any holographic representational scheme 
are cross-talk, communication, invari¬ 
ance, and the inability to capture structure. 
Essentially the same problems have 
prevented the development of holographic 
computer memories or recognition sys¬ 
tems despite considerable effort. Consider 
the problem of representing the concept 
grandmother as a pattern of activity of all 
the units in some memory network. Notice 
what happens if two (or more) concepts 
are presented at the same time—for exam¬ 
ple, grandmother at the White House. 
Obviously, if every single unit must have 
a specified value for the network to cap¬ 
ture grandmother, then no other concept 
can be active at the same time. If each con¬ 
cept is spread over some fraction of the 
units, the chances are high that the encod¬ 
ings will overlap and cause confusion. One 
can reduce the probability of cross-talk by 
having fewer units active for each pattern. 
Assuming that the cross-talk is randomly 
distributed, Willshaw 7 shows that the sys¬ 
tem with many concepts will be reliable 
(even for single entries) only if the number
of units active for each pattern is proportional
to the logarithm of the number of
units in the diffuse memory. This means
that a network of 1,000,000 units should
use an encoding with about 20 active units
per concept, essentially a localized representation.

Figure 2. Interaction between a knowledge network and a routine.
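
The figure of about 20 active units can be reproduced with a one-line calculation. The sketch below assumes a base-2 logarithm and a proportionality constant of 1; both are assumptions made here for illustration, since the text states only that the count is proportional to the logarithm of the memory size.

```python
import math


def active_units_per_pattern(total_units: int) -> int:
    """Rough number of active units per concept.

    Assumes log base 2 and a proportionality constant of 1
    (illustrative assumptions, not stated in the text).
    """
    return round(math.log2(total_units))


print(active_units_per_pattern(1_000_000))  # 20
```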

A related problem with diffuse 
representations is that only one concept at 
a time can be transmitted between sub¬ 
systems, if each concept is a pattern on the 


whole bus. The sequential nature of dif¬ 
fuse representations is particularly 
troublesome when we consider how infor¬ 
mation about a complex scene can be 
transferred from vision to other systems 
such as language and motor control. There 
appears to be no alternative to assuming that, at
least for simultaneous communication, 
representation of concepts must be largely 
disjoint and thus compact. 

The same communication problem 


arises within the concept memory itself if 
one tries to build Figure 2’s knowledge 
structure with a diffuse representation. If 
a concept like “salty” is represented only
by a large pattern, the links for this entire
pattern must go to all places related to
saltiness and be treated correctly at each
of them. If a concept encoded by N units
must be linked to M other concepts, a total
of M·N links will be needed. The more distributed
the representation, the more serious
this problem becomes. Again, any
serious reduction in this wiring require¬ 
ment constitutes a compact representa¬ 
tion. And, as in the intermodal case, unless 
these representations are largely disjoint, 
concept processing must be sequential to 
avoid cross-talk. This eliminates the mas¬ 
sively parallel processing required by the 
100-step constraint. 
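
To make the wiring cost concrete, the sketch below evaluates the M·N product for a compact and a diffuse encoding. The specific values of M and N are hypothetical, chosen only to show the scale of the difference.

```python
def links_needed(units_per_concept: int, related_concepts: int) -> int:
    """Links required when each of the N units encoding a concept must
    reach each of M related concepts: M * N in the text's notation."""
    return units_per_concept * related_concepts


# Hypothetical figures: a concept related to 1,000 others.
compact = links_needed(units_per_concept=20, related_concepts=1000)
diffuse = links_needed(units_per_concept=100_000, related_concepts=1000)

print(compact)  # 20000
print(diffuse)  # 100000000
```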

Moreover, no one has suggested how to 
represent any but the simplest concepts in 
the holographic style. In fact, it is far from 
obvious how to make the same invariant 
distributed pattern active for alternative 
views of a chair. All proposed holographic 
structures are totally without internal 
structure. Another idea would associate 
the representation’s components with 
microfeatures, but these are still unstruc¬ 
tured. Nor do any of the holographic 
proposals provide a way of answering even 
simple questions like the color of grand¬ 
mother’s hair. The only general suggestion 
on how to encode structured knowledge in 
a holographic system is to encode each 
proposition (for example, “Dave likes 
candy”) as a separate memory. 8 One can 
build up arbitrary structures in this way, 
but at the cost of losing all the advantages 
that led to connectionist models in the first 
place. The possible technical advantages 
of exploring such models have nothing to 
do with human concept memory—for 
example, they violate the 100-step con¬ 
straint by many orders of magnitude. 

Another argument for highly dis¬ 
tributed representations derives from the 
large number of input fibers (about
10^4) to cortical neurons. If all these
fibers participate actively, then, by its very 
nature, the representation is diffuse. While 
there has been no definitive study on the 
number of presynaptic events required for 
neural firing, estimates gleaned from 
papers and conversations run from one 
event to a few dozen. 9 No one has sug¬ 
gested that several thousand synapses must 
fire at once for an action potential. Also, 
many of the connections could represent 
alternative ways of activating the same 
concept (for example, from different 
points in space). Another way of looking 
at this is that the thousand-fold connec¬ 
tivity is capturing an OR of activation con¬ 
ditions rather than an AND. Finally, 
learning in a connectionist system requires 
the potential for many more connections 
than are ever made functional. 

All of this may seem to be flogging a 
dead-horse model, but purely holographic 
theories continue to be seriously 
proposed—for example, the recent flurry 


of interest in spin-glasses as holographic 
memory models in the theoretical physics 
community. When addressing specific 
problems, Hopfield and Tank 1 use highly 
structured compact representations, but in 
general speculations they prefer holo¬ 
graphic style models. Any physical system 
will, in isolation, reach some stable state, 
and each of these states can be looked on 
as encoding a different concept— 
obviously, a massively parallel system. 

A more sophisticated version of the 
universal neural-net hope is that new 
learning techniques will induce the 
required structure in an initially uniform 
or random network. There has been signif¬ 
icant recent progress in connectionist 
learning, 6 but any nontrivial neural model 
also requires a great deal of prior structure. 
For example, the primate visual system has 
at least a dozen subsystems, each with 
elaborate internal and external connection 
structure. 9 Any notion that a general 
learning scheme will obviate the need for 
neuroscience, psychophysics, perceptual 
psychology, and computer vision research 
disappears as soon as one takes the vision 
problem seriously, and there is no reason 
to believe that language, problem solving, 
etc., are simpler or less structured. For 
general computation, Judd 10 has shown 
that even the problem of learning weights 
to memorize a lookup table is NP- 
complete and thus intractable. This 
implies that our formalisms, simulators, 
and neurocomputers must support both 
complex structure specification and 
dynamic weight change. Programming 
such systems, understanding their 
behavior, and controlling how they adapt 
are major continuing concerns. 

The role of specialized hardware. In 

moving from science to technology, the 
issues change somewhat. Physical limits 
on computation speed are forcing a move 
to parallel systems, but not necessarily to 
massively parallel neural-style machines. 
As is well-known, current computer ele¬ 
ments are about a million times faster than 
neurons, but neurons have a thousand¬ 
fold greater connectivity. These and other 
basic differences between electronic and 
living systems suggest that radically differ¬ 
ent architectures and algorithms might be 
appropriate in the two cases. Or maybe 
not; it is simply too early to tell. We do 
know that neural models can be effectively 
simulated on essentially any computer and 
are among the few computations that can 
automatically exploit an arbitrary amount 
of parallelism. 11,12 For the foreseeable 


future, the bulk of neural network 
research will (and should) be carried out 
with simulations on conventional com¬ 
puters. Building general-purpose neu¬ 
rocomputers is, at best, premature. 

The interesting question concerns the 
long-term future of massively parallel sys¬ 
tems in computation. A remarkable fact 
about computer architecture is its essential 
sameness for the vast majority of existing 
systems. Special-purpose architectures 
have had a negligible impact, and this 
should be a caution against neuro¬ 
euphoria. Moreover, basic reasons virtu¬ 
ally preclude neurocomputers replacing 
conventional ones for most tasks. The 
main reason is the greatly lower cost of 
passive versus active elements. Many exist¬ 
ing tasks, such as word processing, can be 
done adequately by a very small number 
of (very fast) active elements. More gener¬ 
ally, one can view the development of writ¬ 
ing, of mathematics, and later of comput¬ 
ers as ways of complementing the natural 
capabilities of neural-style representation 
and computation. 

If they are not going to replace conven¬ 
tional machines, what future is there for 
neurocomputers? One possibility is that 
calculations of physical systems will be 
expressed best by massively parallel net¬ 
works, more or less directly simulating the 
situation of concern. Some low-level sig¬ 
nal processing might best be done by par¬ 
allel analog or hybrid networks. 13 These 
seem quite plausible, but are a small 
(though important) part of computation. 
The best hope for widespread use of neu¬ 
rocomputers is, unsurprisingly, in com¬ 
putationally intensive areas not 
successfully attacked with conventional 
systems. The obvious ones are those that 
require human-like capabilities in percep¬ 
tion, language, and inference—the tradi¬ 
tional concerns of artificial intelligence. If 
such efforts succeed, there will be related 
applications in real-time control of com¬ 
plex systems and others we can’t anticipate 
now. 

If the practical future of neurocom¬ 
puters depends on intelligent activity, we 
have come full circle—back to the scien¬ 
tific issues. The critical question is do some 
features of massively parallel computation 
make it uniquely suited for calculations 
associated with intelligence? The shortest 
path to an answer may be the indirect one 
of exploring connectionist models of intel¬ 
ligence. 2,3 If neural-style computation has 
unique advantages, these should appear in 
the higher level specifications of perception,
language understanding, and the like.
If the conventional symbolic formalisms
suffice at these levels, it is exceedingly
unlikely that neurocomputers will be
needed. Of course, another consequence
of the convergence of scientific and technological
goals is a shared interest in the
detailed structure and function of natural
intelligence.

Figure 3. Networks for the Necker Cube. [Screen image of the simulator's graphics interface not reproduced; visible labels include B-CLOSER CUBE, H-CLOSER CUBE, and VERTEX ORIENTATION.]

When the faith in all-encompassing neu¬ 
ral networks is abandoned, we are left with 
a complex set of interacting scientific and 
engineering issues. Rather than causing the 
demise of conventional computer science 
and engineering, neural networks present 
a wide range of new problems and oppor¬ 
tunities. Thus, if we are going to design 
and debug neural networks, an appropri¬ 
ate set of tools could be very helpful. One 
such package, described below, has been 


in use for some time at Rochester and at 
scores of other laboratories. 


Simulation 

environment 

We can capture some of the flavor of 
connectionist computation with a simple 
example. The cube shown in Figure 3 is a 
famous optical illusion originally due to 
the Swiss naturalist, L.A. Necker (1832). 
Most people initially see the cube with the 
vertex B closer to them, but it also can be 
seen as a cube with vertex H closest to the 
observer. If you focus on vertex H and 
imagine it coming out of the paper toward 
you, the picture will flip to the H-closer 


cube. Notice also that the flip takes less 
than a second. The Necker cube is interest¬ 
ing to psychologists because it flips spon¬ 
taneously between the two views if you 
keep looking at it. To us, it is interesting 
because of what it reveals about parallel 
computation. 

You have observed how quickly the 
Necker cube flips state and know how slow 
the underlying human computing elements 
are. It seems unlikely that a sequential pro¬ 
gram on such a slow device could do the 
job. But the situation is much more com¬ 
plex. We know that both human and com¬ 
puter vision require several levels of 
processing. Typical levels include edge seg¬ 
ments, lines, vertices, faces, and object 
descriptions. The edges and lines are the 
same for both the H-closer and B-closer
cubes, but many other visual features are
seen differently in the two views. For
example, vertex C is oriented into the
paper in the B-closer reading, but out of
the paper in the other reading. Similarly,
C is closer than G in the B-closer reading,
and all these perceptions are mutually consistent
and reinforcing. The remarkable
fact is that our visual system simultaneously
flips all these perceptual decisions
from one mutually consistent reading of
the cube to the other. This illustrates the
key cooperative property of massively parallel
computation and its conceptual
difference from Von Neumann computation
on standard machines.

    #define FRONT = 0 BACK = 1 DEPTH = 2                   - literal constants

    CreateUnits()                                          - make units and sites
    {
        for i = 0 to 25
            j = MAKEUNIT("node", UFasymp)                  - asymptotic function
            ADDSITE(j, "excite", SFweightedsum)            - weighted sum function
        NAMEUNIT("VIEW1", ARRAY, 0, 4, 3)                  - view B units
        NAMEUNIT("B-CLOSER", SCALAR, 12, 0, 0)             - view B top level
        NAMEUNIT("VIEW2", ARRAY, 13, 4, 3)                 - view H units
        NAMEUNIT("H-CLOSER", SCALAR, 25, 2, 0)             - view H top level
    }

    CreateInhibition()                                     - make inhibit links
    {
        for i = 0 to 7 BiLink(i, i + 13, -575)             - link opposing vertex
        for i = 8 to 11 BiLink(i, i + 13, -800)            - and depth units
        BiLink(INDEX("B-CLOSER"), INDEX("H-CLOSER"), -1000)
                                                           - link top-level units
    }

    CreateExcite(front, back, depth, cube)                 - make excitatory links
    {                                                      - for one view
        for i = 0 to 3
            BiLink(front + i, front + ((i + 1)%4), 200)    - around front face
            BiLink(back + i, back + ((i + 1)%4), 200)      - around back face
            BiLink(depth + i, depth + ((i + 1)%4), 300)    - between depth units
            BiLink(front + i, back + i, 200)               - between faces
            BiLink(front + i, depth + i, 300)              - between vertex and
            BiLink(back + i, depth + i, 300)               - depth units
            BiLink(depth + i, cube, 500)                   - from depth to cube
                                                           - and cube to depth
    }

    BiLink(firstunit, secondunit, weight)                  - make symmetric links
    {                                                      - no link function
        MAKELINK(firstunit, secondunit, "excite", weight, NULL)
        MAKELINK(secondunit, firstunit, "excite", weight, NULL)
    }

    build()                                                - build function
    {
        CreateUnits()                                      - make units
        CreateInhibition()                                 - make inhibit links
        CreateExcite(INDEX("VIEW1", FRONT), INDEX("VIEW1", BACK),
                     INDEX("VIEW1", DEPTH), INDEX("B-CLOSER"))   - VIEW1 links
        CreateExcite(INDEX("VIEW2", FRONT), INDEX("VIEW2", BACK),
                     INDEX("VIEW2", DEPTH), INDEX("H-CLOSER"))   - VIEW2 links
    }

Figure 4. Code for setting up the Necker cube of Figure 3. Primitives are in uppercase.

Figure 3 also illustrates some details of 
the connectionist paradigm. In our 
models, each item of interest is represented 
as a computational unit with connections 
to many other units. Each unit has a level
of activity (here between -100 and 100)
and automatically sends the value of this 
activity along all its outgoing connections. 
For example, the top double arrow in Fig¬ 
ure 3 depicts the fact that the unit 
representing the H-closer cube has positive 
two-way links with the four relative dis¬ 
tance detectors: A < E, etc. Each unit at 
each level in this network has a rival to 
which it is connected by a mutual inhibi¬ 
tion link. The strengths of links into a par¬ 
ticular unit can also be displayed. The only 
other information needed for a complete 
model is the rule by which a unit computes 
its new activity from its inputs and its old 
activity; the simulator described in this sec¬ 
tion allows a wide variety of update 
algorithms. We can assume for now that 
the units compute the sum of their positive 
and negative inputs. Networks like Figures 
2 and 3 are not very sensitive to the exact 
choice of unit computation rules; this is 
one of the reasons for their attractiveness. 
Units that are all mutually connected by 
negative links are said to comprise a 
winner-take-all network. Such networks 
are one of the main decision mechanisms 
in connectionist models and have known 
neurophysiological analogs. 
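
A winner-take-all network of this kind can be sketched in a few lines. The example below is a minimal illustration, not the simulator's update rule: the update rate, the clamping to the range -100 to 100, the choice to let only positive activity inhibit rivals, and the input values are all assumptions made for the sketch.

```python
def winner_take_all(inputs, inhibition=0.5, rate=0.1, steps=500):
    """Synchronous winner-take-all over mutually inhibiting units.

    Each unit sums its external (positive) input and the negative
    input from its rivals, then moves its activity by a fraction
    `rate` of that sum. Activity is clamped to [-100, 100].
    """
    activity = [0.0] * len(inputs)
    for _ in range(steps):
        previous = list(activity)          # all units read last step's values
        for i, external in enumerate(inputs):
            rivals = sum(max(a, 0.0)
                         for j, a in enumerate(previous) if j != i)
            net = external - inhibition * rivals
            activity[i] = max(-100.0, min(100.0, previous[i] + rate * net))
    return activity


final = winner_take_all([10.0, 12.0, 9.0])
print(final)  # [-100.0, 100.0, -100.0]
```

The unit with the strongest external support saturates at the maximum while its rivals are driven to the minimum, which is the decision behavior described above.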

Much of the effort in massively paral¬ 
lel AI is dedicated to using computational 
frameworks like that in Figures 2 and 3 to 
build models of intelligent activity. 
Advantages of this approach include its 
link to natural intelligence, increased noise 
resistance, and ease of implementation on 
parallel hardware. But the main advantage 
of the connectionist approach is that it 
provides a much better way of specifying 
some computations. No alternative way 
appears to describe the Necker cube 
phenomenon nearly so clearly and con¬ 
cisely as Figure 3. 

Specifying and simulating networks. 

Researchers experimenting with structured 
connectionist networks must be able to 
implement and test their ideas. Since our 
network models presume much structure, 
and network architecture design is a major 
component in the research effort, it is 
necessary to develop a network description 
language and a simulation system that sup¬ 
ports varied architectures. Different 
researchers working in diverse areas need 
the ability to build and simulate radically 
different networks. Three of the most 
important aspects that may differ are con¬ 
nection topology, activation functions, 
and the amount of data associated with 
each network node. 

Connectionist networks consist of sim¬ 


ple computational elements (units) that 
communicate by sending their level of acti¬ 
vation via links to other elements. The 
units have a small number of states and 
compute simple functions of their inputs. 
Associated with each link is a weight, 
indicating the “significance” of activation 
arriving over that link. The pattern of con¬ 
nections, the weights on the links, and the 
unit functions determine network 
behavior. 

The Rochester Connectionist Simulator 
has evolved from simple beginnings into a 
sophisticated research tool that supports 
construction and simulation of a wide vari¬ 
ety of networks. The main design criterion 
has been flexibility. Each unit can compute 
a different function and have any amount 
of associated data; an arbitrary connection 
pattern may be specified. This flexibility 
exacts its cost in time and space. Com¬ 
pared to special-purpose simulators, it is 
time-expensive because each unit and link 
is simulated by a separate function call, 
and space-expensive because each unit and 
link must have an explicit representation. 

The network paradigm supported by the 
simulator is quite general. Each unit has a 
number of sites at which incoming links 
arrive. The provision of sites allows 
differential treatment of inputs, since the 
links themselves do not indicate their ori¬ 
gin at the destination site. Associated with 
each unit are various pieces of data, 
including potential, output, and state. The 
potential corresponds to the unit’s level of 
activation, the output is transmitted along 
all links emanating from the unit, and the 
state is used to make simple decisions 
about how to interpret the unit. Associated 
with each link, site, and unit is a function 
representing its action. 
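
The unit, site, and link organization described here might be modeled as follows. This is a simplified sketch: the class names, the two site names, and the particular unit function are assumptions for illustration, and the real simulator is a C library whose data layout is not shown in this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    source: "Unit"      # unit whose output travels along this link
    weight: float       # "significance" of activation arriving here


@dataclass
class Site:
    name: str
    links: List[Link] = field(default_factory=list)

    def value(self) -> float:
        # Site function: weighted sum of the outputs arriving at this site.
        return sum(link.weight * link.source.output for link in self.links)


@dataclass
class Unit:
    name: str
    potential: float = 0.0   # level of activation
    output: float = 0.0      # value sent along all outgoing links
    state: str = "idle"      # used for simple decisions about the unit
    sites: List[Site] = field(default_factory=list)


def update(unit: Unit) -> None:
    """Hypothetical unit function: excitation minus inhibition."""
    by_site = {site.name: site.value() for site in unit.sites}
    unit.potential = by_site.get("excite", 0.0) - by_site.get("inhibit", 0.0)
    unit.output = max(unit.potential, 0.0)
    unit.state = "active" if unit.output > 0 else "idle"


a = Unit("a", output=10.0)
b = Unit("b", output=4.0)
target = Unit("target", sites=[Site("excite"), Site("inhibit")])
target.sites[0].links.append(Link(a, 0.5))   # arrives at the excite site
target.sites[1].links.append(Link(b, 1.0))   # arrives at the inhibit site
update(target)
print(target.potential, target.state)  # 1.0 active
```

Because the links carry no record of their origin, it is the site at which a link arrives that lets the unit function treat the two inputs differently.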

Network specification. The construc¬ 
tion environment provides a set of primi¬ 
tives for specifying a network. Typical 
primitives make a unit or link, or name a 
unit or array of units. The primitives pro¬ 
vide a conceptual structure through which 
networks may be specified. Rather than 
define an entirely new language for speci¬ 
fication, we use an existing programming 
language, augmented by the primitives. 

A network is built in the simulator by a 
user program written in C. The primitives 
are implemented as library functions, 
called from the user program. The speci¬ 
fication parameters for units, sites, and 
links include initial data values and an acti¬ 
vation function. These activation func¬ 
tions may be different in every case, either 
written by the user or supplied from a 


library. Within each unit, site, and link, a 
general-purpose data field exists, which 
can be used to point to an arbitrarily large 
structure. When displaying, saving, or 
reloading the network, user-supplied func¬ 
tions are called by the simulator to handle 
these user-defined structures. 

The large library of functions facilitates 
the construction task by supplying many 
of the commonly used unit, site, and link 
functions, as well as functions to create 
particular network structures. Researchers 
may add their own unit and structuring 
functions to the library, thus augmenting 
the simulator’s utility for themselves and 
others. An example is a library that greatly 
eases the construction of back- 
propagation networks, as either stand¬ 
alone networks or submodules within a 
larger network. 14 This library allows arbi¬ 
trary error propagation functions as well 
as any unit activation function. 

An important consideration in specify¬ 
ing a network is the ability to give descrip¬ 
tions at different levels of abstraction. At 
the lowest level, the unit, site, and link 
functions give single-unit descriptions. At 
the next level, a set of functions, which 
specify the links and groups of units, 
describe the pattern of connectivity. This 
may include modularity within the net¬ 
work, and for any large network necessar¬ 
ily reflects regularity. At a still higher level, 
network specifications given in user- 
defined languages may be read in and com¬ 
piled into units and links by user-supplied 
functions. 5 

A sample construction program. For an 

example of a network and its specification 
program, let’s look at the Necker cube net¬ 
work in Figure 3. It consists of 26 units — 
eight pairs for the vertices, four relation 
pairs, and the decision units. The units cor¬ 
responding to the two views are displayed 
as solid-line squares and dotted-line 
squares, respectively. The figure shows the 
situation when the H-closer view has 
almost been recognized. Most of the units 
consistent with this view are near maxi¬ 
mum activation, and those of the other 
view at minimum. 

The vertex units represent local three- 
dimensional orientation information. The 
relation units represent pairwise con¬ 
straints between corresponding vertices, 
and the cube units represent the two pos¬ 
sible interpretations. The links to or from 
a unit can be displayed with the graphics 
interface, which was used to generate Fig¬ 
ure 3. 

Figure 4 shows a specification program
for this network, written in pseudocode.
As can be seen, the structure in the network
is reflected in the structure of the
specification program. Units 0 to 7 represent
the B-closer cube’s vertex nodes,
ABCDEFGH in order. Units 8 to 11 represent
the depth level, A < E, B < F, C < G,
and D < H in order. Unit 12 is the B-closer 
unit. Units 13 to 25 are encoded in a simi¬ 
lar manner for the H-closer network. The 
linking specifications implement this 
choice of encoding. (See Feldman et al. 12 
for details.) 

Simulation. Once a network has been 
constructed, the simulation, run either 
synchronously or asynchronously, can 
begin. During synchronous simulation, all 
units use the output values computed dur¬ 
ing the previous step as their input. The 
order of simulation is unimportant since 
the network behaves as though all units 
update simultaneously. During asyn¬ 
chronous simulation, a fraction of the 
units are updated at each step in pseu¬ 
dorandom order, and the new output 
value is immediately transmitted to the 
other units. Synchronous simulation is 
easier to understand and is marginally 
faster than asynchronous, but in cases like 
the Necker cube example, synchronous 
updating with deterministic unit functions 
always leads to the same outcome. In other 
cases, synchronous simulation can lead to 
oscillations or other problems of failing to 
break symmetry. 
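The difference between the two update modes can be illustrated with a minimal sketch. It is not the simulator's code: the threshold units, the weights, and the names `step_sync` and `step_async` are assumptions chosen only to show how two mutually inhibiting units oscillate under synchronous updating and settle under asynchronous updating.

```python
import random

def step_sync(outputs, weights, bias):
    """Update every unit from the outputs computed during the previous step."""
    n = len(outputs)
    return [
        1.0 if bias[i] + sum(weights[i][j] * outputs[j] for j in range(n)) > 0 else 0.0
        for i in range(n)
    ]

def step_async(outputs, weights, bias, fraction, rng):
    """Update a fraction of the units in pseudorandom order.

    Each new output is written back immediately, so units updated later
    in the same step already see it.
    """
    n = len(outputs)
    outputs = list(outputs)
    k = max(1, int(fraction * n))
    for i in rng.sample(range(n), k):
        net = bias[i] + sum(weights[i][j] * outputs[j] for j in range(n))
        outputs[i] = 1.0 if net > 0 else 0.0
    return outputs

# Two mutually inhibiting threshold units (hypothetical weights and biases).
weights = [[0.0, -1.0], [-1.0, 0.0]]
bias = [0.5, 0.5]

# Synchronous: both units switch together and never break the symmetry.
sync_trace = [[0.0, 0.0]]
for _ in range(4):
    sync_trace.append(step_sync(sync_trace[-1], weights, bias))

# Asynchronous: whichever unit is updated first wins and suppresses the other.
rng = random.Random(0)
state = [0.0, 0.0]
for _ in range(10):
    state = step_async(state, weights, bias, fraction=1.0, rng=rng)
```

In this toy case the synchronous trace alternates between both units off and both units on, while the asynchronous run ends with exactly one unit active.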

A command interface controls the simu¬ 
lation and modifies the network during 
simulation. The interface allows execution 
of any user function and of a preloaded set 
of commands. One current trend in our 
simulation environment is toward a more 
interactive system for network specifica¬ 
tion. Dynamic code-loading capabilities 
allow runtime redefinition of activation 
functions and reconstruction of parts of 
the network. Integration with the Com¬ 
mon Lisp and Scheme environments pro¬ 
vides the added expressive power of those 
languages for network specification. 

Performance. Two major performance
issues in neural network simulation are size
and speed. On a SUN3/260, a network
of 2000 units (computing the
weighted sum of their inputs), each with 
100 links, took 25 seconds to construct and 
performed 100 simulation steps in 83 
seconds, giving an interconnect time of 
approximately four microseconds. The 
number of links in a network is a crude but 
effective measure of its size; on the
SUN3/260 with eight megabytes of memory,
the maximum number of links
attainable before thrashing sets in is of the 
order of a quarter of a million. The paral¬ 
lel simulator, discussed below and in Feld¬ 
man et al., 12 can do much better. 
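The interconnect time quoted above follows from the reported figures. The short calculation below is only a check of that arithmetic; the variable names are ours, and it assumes the time is counted per link per simulation step.

```python
# Figures reported in the text for the SUN3/260 run.
units = 2000
links_per_unit = 100
steps = 100
elapsed_seconds = 83.0

# Every link is evaluated once per step.
link_updates = units * links_per_unit * steps
microseconds_per_link = elapsed_seconds / link_updates * 1e6

print(f"{link_updates} link updates, {microseconds_per_link:.2f} microseconds each")
```

The result is roughly 4.15 microseconds per link update, which the text rounds to four.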

While space and time efficiency of the 
simulation engine is crucial for experi¬ 
ments with large networks, clearly, ease of 
use, extensibility, and flexibility are 
equally important attributes. Future 
needs, such as large networks running on 
parallel hardware, will require fast simu¬ 
lation systems connected to a high-speed 
graphics interface. 

Graphics interface. The current 
graphics interface, a first-generation tool 
developed by Kenton Lynne, has proved 
indispensable for displaying network 
information during simulation and for 
aiding the network debugging process. It 
allows the user who has created a network 
with the simulator to graphically view and 
examine that network before, during, and 
after execution. Each significant aspect of 
each unit in the network can be displayed 
as a separate graphic object (or icon) 
whose size, shape, or shading varies with 
the current value of that aspect. As the 
simulation runs, the icons are constantly 
updated to reflect changing values, giving 
an overall view of what the network is 
doing. In keeping with the philosophy of 
the simulator, the graphics interface was 
designed to give maximum flexibility to the 
user in how the network is displayed. 

Lynne wrote the graphics interface spe¬ 
cifically to run on our Sun workstations 
using Sun Microsystems’ graphics tool 
package. He designed it as a separate part 
of the simulator package; the user speci¬ 
fies it as an option when the network is 
compiled into object code. If included, the 
graphics interface code automatically cre¬ 
ates its own window, which layers itself 
between the user and the simulator. The 
user then interacts with the simulator via 
the graphics interface window, which 
either executes user-given graphics com¬ 
mands or passes appropriate commands to 
the simulator. It provides and maintains a 
graphics panel, which displays the graphi¬ 
cal representation of the user’s network as 
simulation proceeds. Figure 3 shows the 
graphics interface tool in potential mode 
examining the activations for the Necker 
cube example. The simulator is unaware of 
the graphics interface and, in fact, still 
generates text output to its own window, 
just as though the user were interfacing to 
it directly. 


In addition to displaying and running 
the network, the graphics interface has 
facilities that allow the user to 

• display detailed textual information 
for specific units, 

• show network topology, 

• write the graphics display to a raster 
file, 

• put text and line drawings on the dis¬ 
play for documentation purposes, 

• log the simulator and graphics com¬ 
mands for later replay, and 

• map the mouse buttons to execute 
simulator or graphics commands. 

There is also a version that communicates 
with the parallel connectionist simulator 
running on the department’s BBN Butter¬ 
fly multiprocessor. 

Through commands or mouse actions, 
the user specifies exactly where the icons 
are to appear in the display space, which 
is effectively an unbounded Cartesian 
plane. The graphics panel, which can be 
stretched vertically and horizontally, 
always shows some finite rectangle of the 
display space and can be moved via com¬ 
mands or mouse actions to show different 
parts of the display space. Thus, the size 
of a displayable network is limited only by 
machine memory. (Each displayed aspect 
requires 48 additional bytes of memory.)
The user can also add text and line draw¬ 
ings to the display space to document the 
network as shown in Figure 3. All objects 
(icons, text, drawings) appearing on the 
graphics display can be freely moved 
around with commands or the mouse as 
dictated by taste or function. 

Using the graphics interface signifi¬ 
cantly reduces the time needed to get a con¬ 
nectionist network up and running. With 
the proper unit aspects displayed, one can 
quickly determine if a network is working 
properly and if not, where the problem 
lies. Much of the power of the interface lies 
in its dynamic properties, which do not 
show up in static pictures. For example, 
one can catch oscillations, which can para¬ 
lyze a network, almost immediately. The 
graphics interface makes it much easier to 
specify and debug the large, structured 
networks we work with at Rochester. 

Parallel implementation. We have also 
implemented the connectionist simulator 
on a BBN Butterfly multiprocessor. The 
parallel simulator looks and functions 
much like the uniprocessor simulator. 
Code can be ported with little modifica¬ 
tion, as long as it does not directly access 
the network data structures. If the user is
content with a naive network partition, he
can ignore the fact that the simulator is
actually running on several processors.

Figure 5. Tinker Toy low-level processing: (a) original figure; (b) response to Kirsch operator; (c) extracted lines; (d) extracted
circles; (e) final symbolic representation, including connections detected between parts.

Using the Butterfly multiprocessor sig¬ 
nificantly increases the speed and capac¬ 
ity of network simulations. Our largest 
Butterfly configuration is 120 processors 
with a total of 120 megabytes of memory. 
It easily runs networks that do not begin 
to fit on any of our uniprocessor machines 
and achieves nearly linear speedup. 12 
Even with smaller networks, simulations 
that would run for hours on a uniprocessor 
take only minutes. 

Sample applications 

The driving force behind the system 
developments described above has been 
applications of connectionist models, par¬ 
ticularly to problems in artificial intelli¬ 
gence. Since all connectionist networks are 
currently simulated on inappropriate 
hardware, applications refer to concept 
demonstrations and scientific models, not 
to programs for practical use. The 
literature 2,3 and this issue of Computer
describe many such applications. We will
focus on some recent Rochester work that 
illustrates the structured connectionist 
style and the use of the simulator. 

The first example is a rather ambitious 
vision project carried out by Paul Cooper 
and Susan Hollbach 15 with help from 
Steven Whitehead and others. The basic 
problem is to match real images of Tinker 
Toy objects to stored topological models. 
This is a simplified model of how a neural-style
network could perform visual recognition
in the required 100 time steps. In
previous research, Cooper and Hollbach 
established that the early stages of vision 
can be done effectively in parallel; the 
problem is how to do high-level matching 
to object models. The low-level vision 
preprocessing for this is depicted in Figure 
5. Figure 5a shows a digitized picture of a 
Tinker Toy horse, the input to the system. 
Figure 5b represents the magnitude of the 
gradient produced by the Kirsch operators 
(that is, the edge picture). Figures 5c and
5d have the straight lines and circles found
by the Hough technique from the edge
information of 5b. Figure 5e demonstrates
the connectivities found between rods and 
disks: the small circles represent joints 
between a rod and disk. 

Figure 6 gives a typical model database. 
The idea is to match the analyzed input to 
all the models in parallel using a structured 
connectionist network. Figure 7 shows a 
model horse, labeled by letters, and a pos¬ 
sible input image, labeled by numbers. The 
matching network tries simultaneously to
match compatible wheels (ones with
the same number of rods) and to ensure 
that adjacent wheels in the image match 
adjacent model wheels. These mutual con¬ 
straints are adequate to select the correct 
model for a wide range of tasks. Figure 8 
shows some of the consistency constraints; 
for example, if the image pair 3-4 matches 
the model pair B-C, then 3 must match B 
or C. The details of the model are moder¬ 
ately complex, involving several winner- 
take-all networks and simultaneous com¬ 
parison of all models. 
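The two kinds of constraint described above can be written down as simple predicates. This is a sketch of the logic only, not of the connectionist network that enforces it; the function names and the tuple encoding of matches are assumptions made for illustration.

```python
def wheels_compatible(image_rods, model_rods):
    """Two wheels may match only if they carry the same number of rods."""
    return image_rods == model_rods

def pair_supports(pair_match, wheel_match):
    """Consistency between a pair match and a single-wheel match.

    pair_match  = ((image_a, image_b), (model_a, model_b))
    wheel_match = (image_wheel, model_wheel)
    If the image wheel belongs to the matched image pair, its model wheel
    must be one of the two wheels in the matched model pair.
    """
    (image_a, image_b), (model_a, model_b) = pair_match
    image_wheel, model_wheel = wheel_match
    if image_wheel not in (image_a, image_b):
        return True  # the pair match says nothing about this wheel
    return model_wheel in (model_a, model_b)

# The example from the text: image pair 3-4 matched to model pair B-C.
pair = ((3, 4), ("B", "C"))
```

With this encoding, `pair_supports(pair, (3, "B"))` holds, while `pair_supports(pair, (3, "A"))` does not. In the network itself such constraints appear as links between matching units rather than as explicit tests.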

Figure 6. Tinker Toy model base.
Figure 7. Example of a figure and matching model.
Figure 8. Wheel-matching array, constraint-matching array, and example constraint propagation links.

Figure 6 also depicts the results of
matching the results of Figure 5 with the
21 models. The central goal of recognizing
the target horse figure is accomplished, 
and reasonable scores are obtained by par¬ 
tial matching of approximately similar 
figures. Although the network was 
designed primarily to accept or reject 
matches of objects with the same number 
of disks, it also computes partial matches 
and matches of objects with differing 
numbers of disks. For example, compare 
the extracted figure with models 3 and 7 
(judged to be topologically quite similar) 
and models 4, 6, and 15 (judged fairly 
similar). 

The Necker cube and Tinker Toy exam¬ 
ples are instances of AI recognition prob¬ 
lems. Several other problems are like this, 
but many others are not. Can we apply 
structured connectionist models to other 
traditional AI issues such as knowledge 
representation and inference? There is 
much less completed research along these 
lines, but some promising starts have been 
made. The example in Figure 2 should con¬ 
vey the flavor of this work. 

The standard way to explore the issue of 
knowledge representation and inference is 
in terms of programs that can answer ques¬ 
tions. The many AI approaches to 
developing question-answering systems 
have the same basic requirements: One 
needs a way to store knowledge, to pose 
questions, and to compute and register 
answers. In a connectionist model, all 
these aspects must be expressed in terms of 
activity spreading among simple units like 
those in the previous examples. 

It is easiest to start with the recording of 
answers. In Figure 2, the possible tastes of 
foods form a winner-take-all network in 
which each unit inhibits the others so that 
only one answer will be active. The answer 
network is assumed to be part of a routine 
that also poses the question and acts upon 
the answer. The units that make up the 
routine are assumed to be activated in 
sequence from left to right, just like a stan¬ 
dard program. Activating the appropriate 
units sends a question to the knowledge 
network; Figure 2 shows this as links from 
the hexagonal node to the nodes for [has-taste]
and [ham]. The key to this network’s
operation is the behavior of the triangular-shaped
nodes such as [bl] in Figure 2. A
unit shown as a triangle is defined to 
become active when two of its inputs are 
simultaneously active. In this case, [ham] 
and [has-taste] are both on, so [bl] 
becomes active and activates [salty]. Now 
the [salty] node in the knowledge network 
spreads activation to the response [r-salty] 
back in the routine and the question is
answered. The same network can answer
questions like “Name a salty meat” when
activated appropriately. The answers
returned by such a network depend on
context, as people’s answers do; contextual
bias is again modeled by activation.

Figure 9. Input consisting of two objects, with attention focused on the leftmost of
the pair.
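The question-answering behavior of the Figure 2 network can be sketched in a few lines. This is an illustration under stated assumptions, not the network itself: units are reduced to on/off values, the dictionary `binders` stands in for the triangular nodes, and the fact about peas is a hypothetical entry added only so that the winner-take-all stage has something to suppress.

```python
def conjunction(inputs, needed=2):
    """Triangle-style node: active when at least `needed` inputs are active."""
    return 1.0 if sum(1 for x in inputs if x > 0.5) >= needed else 0.0

def winner_take_all(activations):
    """Keep only the most active answer unit; all others are inhibited to 0."""
    if not activations or max(activations.values()) <= 0.0:
        return {name: 0.0 for name in activations}
    best = max(activations, key=activations.get)
    return {name: (1.0 if name == best else 0.0) for name in activations}

# Each stored fact is one binder node: (object, property) -> value.
# The ham entry follows the text; the pea entry is hypothetical.
binders = {("ham", "has-taste"): "salty", ("pea", "has-taste"): "sweet"}

def ask(obj, prop):
    """Activate the question units and read off the answer network."""
    active = {obj: 1.0, prop: 1.0}
    answers = {"salty": 0.0, "sweet": 0.0, "sour": 0.0}
    for (o, p), value in binders.items():
        answers[value] += conjunction([active.get(o, 0.0), active.get(p, 0.0)])
    return winner_take_all(answers)

result = ask("ham", "has-taste")
```

Activating [ham] and [has-taste] leaves only the salty answer unit on. A question about an object with no stored fact leaves every answer unit off.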

In addition, Shastri 5 showed that struc¬ 
tured connectionist knowledge represen¬ 
tations can handle problems that have 
proven difficult for logic-based 
approaches. Suppose we believe that 
Quakers tend to be pacifists and that 
Republicans are generally not pacifist. 
Given an individual who is both a Quaker 
and Republican, it is hard to decide how 
likely he is to be a pacifist. (The recent US 
President who had these two beliefs was 
also a Marine officer.) Shastri’s system
allows the relative strengths of conflicting 
beliefs and correlations to be combined 
according to maximum entropy rules of 
evidence and performs quite well. Again, 
the structured connectionist network pro¬ 
vides a natural mechanism for represent¬ 
ing the required knowledge and for 
capturing inferences based on partial, 
uncertain, and conflicting knowledge. 

An obvious question arising in connec¬ 
tion with work like Shastri’s is how a net¬ 
work like Figure 2 could be learned. The 
neural substrate of memory and learning 
is one of the great unsolved scientific ques¬ 
tions for which we certainly have no defini¬ 
tive answers. But there are connectionist 
theories of learning that are compatible 
with current brain research and are com¬ 
putationally feasible. 6 The key idea is that 
while new connections are rare, weight 
change in connections appears to be com¬ 
mon. We also know that each unit can 
have thousands of incoming and outgoing 
connections. Our hypothesis is that most 
of these connections are only potentially 
important and that learning involves 
strengthening the appropriate connec¬ 
tions. Suppose, for example, the network 
of Figure 2 needs to learn that spinach is 
a salty vegetable. Our model suggests that 
there are uncommitted triangular nodes 
that are weakly connected to many com¬ 
binations of objects, properties, and 
values. In an ideal case, one of them will 
be linked to [spinach], [has-taste] and 
[salty] among other things. This unit will 
get highly activated by the simultaneous 
activation of three of its neighbors and, by 
strengthening its active connections, can 
become dedicated to the new association. 

No implementation of this learning 
scheme has been attempted, but there are 
several related studies and some theoreti¬ 
cal work. 16 


Fanty’s work 17 is our most ambitious 
effort to date on learning in structured 
neural networks. An example system 
based on his work learns to classify struc¬ 
tured objects in a position-independent 
manner. Both structured objects and posi¬ 
tion independence present difficulties to 
the connectionist researcher for essentially 
the same reason. If a network is going to 
perform computations at several loca¬ 
tions, or for several subparts of a compli¬ 
cated object, at the same time, then there 
must be several copies of the computa¬ 
tional mechanism. This is problematical. 
It can lead to a combinatorial increase in 
the required number of units and links, an 
especially troublesome problem in learn¬ 
ing networks where something learned 
locally should be shared globally. The pro¬ 
posed solution is to go sequential. At some 
point, even connectionist networks must 
begin to compute sequentially. Imagine a 
story understander for which the input is 
entered simultaneously: all 100,000 words 
of a novel, each in its own word slot. The 
whole network is replete with redundant 
sentence-understanding subnetworks, 
subplot-understanding subnetworks, and 
so on. This, of course, is ridiculous. The 
idea behind the network described here is 
that some sequential processing is always 
necessary in reasoning about multiple 
objects; there can, however, be resource¬ 
intensive exceptions, such as low-level 
vision. 

For the implemented networks, each
input consists of some number of subobjects,
each with a shape and color. The
subobjects are located on a 4 x 4 grid. So
a single input might be a red square over 
a blue circle, and it could appear in any of 
12 different locations. For each input, the 
network is provided feedback on classifi¬ 
cation. The goal is to have the network 
learn to classify the inputs. 
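The count of 12 locations for a two-part vertical input can be checked by enumeration. The sketch below assumes that "over" means the two subobjects occupy vertically adjacent cells of the grid; the variable names are ours.

```python
GRID = 4  # the 4 x 4 grid from the text

# Every way to place one subobject directly above another.
placements = [
    ((row, col), (row + 1, col))  # (upper cell, lower cell)
    for row in range(GRID - 1)
    for col in range(GRID)
]

print(len(placements))
```

There are three possible rows for the upper subobject and four columns, which gives the 12 placements.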

Figure 9 illustrates the network’s organization.
The grid of units corresponding to
location is the only part of the representation
that is spread in space. There are four
other populations of units, each with a distinct
role. There is a network of object
properties, shown with four values for 
shape and color. There is a network of 
relations, shown with the four spatial rela¬ 
tions: above, below, left, and right. There 
is a population of hidden units, which will 
learn to classify subobjects. Finally, there 
is some number of classification units. 
Each complex input belongs to one of the 
classes. The network learns to classify the 
inputs based on feedback at this level. 

Notice that the position independence 
of the learning is guaranteed; the network 
is prestructured, which is the whole point. 
Learning occurs at the hidden and classi¬ 
fication units, which receive input from 
the property and relation units. The 
activity of these latter units is independent 
of the input’s location on the grid. The 
nature of the representation needs more 
detailed explanation. Each grid unit cor¬ 
responds to a location in the world. When 
one of these is active, it means something 
at that location is being attended to. The 
grid units inhibit each other so that at most
one can be strongly active. The grid units 
representing other objects in the current 
input are not totally quiescent. They main¬ 
tain a low level of activity, not enough to 
interfere with the objects of attention, but 
enough to keep them at the ready (and, 
potentially, enough to prime ongoing 
computations and shifts of attention). 
After being active for a while, a grid unit 
will tire and fall back to a low level of 
activity, and the strongest of the waiting 
grid units will reach full activation. Exter¬ 
nal influences could control the shift of 
attention through priming effects, though 
this does not happen in the example 
network. 

An object is entered one subpart at a 
time by externally activating a grid loca¬ 
tion and some properties. The external 
activation is very strong and causes a tem¬ 
porary binding between the grid object and 
the properties. This is effected via a large 
change in the link weight from the grid unit 
to the property unit. The weight change 
will stay in effect until the grid unit 
becomes totally quiescent. So, as the grid 
units cycle up and down, the associated 
properties do as well. In this example, the 
relations are all spatial and are activated 
by hard-wired relation detectors (not 
shown). In Figure 9, grid unit (4,2) is 
strongly active and grid unit (4,3) is weakly 
active. This activates the left relation: the 
focused-upon object is left of something. 
It also has shape 1 and color 1. 

The job of the hidden units is to classify 
the subobjects. Hidden units are only 
responsive to currently active units. In Fig¬ 
ure 9, an active hidden unit responds to the 
triple: (shape = 1, color = 1, left). Another 
hidden unit responds to (shape = 4, 
color = 3, right), but it is shown at a low 
level of activation because that object is 
not being attended to. In the current imple¬ 
mentation, the learning algorithm used for 
this stage is a variation of competitive 
learning. 14 This was chosen because it 
does not require feedback, which is harder 
to provide in such a dynamic environment. 
However, lack of feedback has numerous 
disadvantages, and research into other 
methods is ongoing. At the classification- 
unit level, evidence for complex objects 
must be accumulated over time. In this 
implementation the links to classification 
units have a memory. Thus, if they were 
recently active, they continue to provide 
support even after the source of the acti¬ 
vation has died down. This is not the ulti¬ 
mate solution, but it does work in this 
simple example. The learning rule for these
links is simple. The weights of links from
active inputs to an active classification unit
are increased; the others are decreased.
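A minimal sketch of this rule follows. The text does not give the step size or the weight range, so the `rate` parameter, the clipping to the interval from 0 to 1, and the function name are assumptions. The link memory mentioned above is not modeled here.

```python
def update_class_links(weights, inputs, class_active, rate=0.1):
    """Adjust the links into one classification unit.

    If the unit is active, links from active inputs are strengthened and
    all other links are weakened. An inactive unit leaves its links alone.
    """
    if not class_active:
        return list(weights)
    return [
        min(1.0, w + rate) if x > 0.5 else max(0.0, w - rate)
        for w, x in zip(weights, inputs)
    ]

# Three incoming links; the first and third inputs are active.
new_weights = update_class_links([0.5, 0.5, 0.5], [1.0, 0.0, 1.0], class_active=True)
```

Here the first and third weights rise by the step size and the second falls by the same amount.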

Many extensions and improvements are 
possible. The grid units could be replaced 
by general-purpose control units with no 
inherent semantics, which bind to object 
properties and relations much as the grid 
units do. Another idea, treating relations 
as distinct from properties at the hidden- 
unit level, would allow a distinct subobject 
learned in one context (for example, 
“below”) to be used in another context. 
Of course, the subobject distinction is arti¬ 
ficial since a complete object in one situa¬ 
tion can be a subobject in another. Finding 
clean ways for connectionist systems to 
switch levels of elaboration is another 
topic of current research in several labora¬ 
tories. 

As these examples help point out,
massively parallel, neural net-style
computation presents a
wide range of opportunities and
challenges. It is certainly not a magic 
Self-Organization in a
Perceptual Network 

Ralph Linsker 
IBM Research 


A young animal or child perceives 
and identifies features in its envi¬ 
ronment in an apparently effort¬ 
less way. No presently known algorithms 
even approach this flexible, general- 
purpose perceptual capability. Discover¬ 
ing the principles that may underlie per¬ 
ceptual processing is important both for 
neuroscience and for the development of 
synthetic perceptual systems. 

Two important aspects of the mystery of 
perception are 

(1) What processing functions does the 
neural “machinery” perform on 
perceptual input, and what is the 
circuitry that implements these 
functions? 

(2) How does this “machinery” come 
to be? 

Unlike conventional computer hard¬ 
ware, neural circuitry is not hard-wired or 
specified as an explicit set of point-to-point 
connections. Instead it develops under the 
influence of a genetic specification and 
epigenetic factors, such as electrical 
activity, both before and after birth. How 
this happens is in large part unknown. 

How can a perceptual system develop to recognize specific features of its environment, without being told which features it should analyze, or even whether its identifications are correct?

Biological development processes are
far too complex to hope that a relatively
complete understanding of how a perceptual
system develops and functions will
soon emerge. But we are familiar with
complex synthetic systems, such as computers,
whose principles of organization
can be understood without one’s knowing
in detail how the components work. Furthermore,
the same principles can be used
to build computers in any of several different
technologies. Might there be organizing
principles

(1) that explain some essential aspects 
of how a perceptual system 
develops and functions; 

(2) that we can attempt to infer without
waiting for far more detailed experimental
information; and

(3) that can lead to profitable experimental
programs, testable predictions,
and applications to synthetic
perception as well as neuroscientific 
understanding? 

I believe the answer is yes, and that the use 
of theoretical neural networks that 
embody biologically-motivated rules and 
constraints is a powerful tool in this study. 

This optimism is encouraged by recent 
work 1 in which I have found that a multilayered
network, developing according
to simple yet biologically plausible “Hebb- 
type” rules, 2 self-organizes to produce 
feature-analyzing “cells.” These “cells” 
have response properties that are qualita¬ 
tively similar to those cells of the first few 
processing stages of the mammalian visual 
system. 3 These properties include sensitiv¬ 
ity to light-dark contrast and sensitivity to 
the orientation of an edge or bar. These 
properties develop before birth in certain 
animals, hence before structured visual 
experience, and in the theoretical network 
the corresponding properties develop even 
in the absence of structured input, using
only random signaling activity in the input 
layer of the network. 

Why does a feature-analyzing function 
emerge from these development rules? Is 
it a mere accident or curiosity? Or are the 
development rules perhaps acting to 
optimize some quantity that is important 
to the information processing function of 
a perceptual system? 
In this article, I briefly summarize the
network ideas from an earlier 
publication 1 and review some of the main 
results. This sets the stage for exploring 
why a feature-analyzing function emerges. 
I then show that even a single developing 
cell of a layered network exhibits a remark¬ 
able set of optimization properties. These 
properties are closely related to issues in 
statistics, theoretical physics, adaptive sig¬ 
nal processing, the formation of knowl¬ 
edge representations in artificial 
intelligence, and information theory. 

Next, I use these results to infer an 
information-theoretic principle that can be 
applied to the network as a whole, rather 
than a single cell. The organizing principle 
I propose is that the network connections 
develop in such a way as to maximize the 
amount of information that is preserved 
when signals are transformed at each 
processing stage, subject to certain con¬ 
straints. 

I illustrate how this principle works for 
some very simple cases. Much more work 
will be needed to apply the principle to 
practical computations of biologically 
important cases, but the approach appears 
very promising. I conclude with some 
speculative comments on why this princi¬ 
ple, or some variant of it, may be impor¬ 
tant for the emergence of perceptual 
function in biological and synthetic 
systems. 

A layered self-adaptive 
network 

The visual system is the best studied per¬ 
ceptual system in mammals. Visual infor¬ 
mation is processed in stages. Simple 
aspects of form, such as contrast and edge 
orientation, are analyzed in the earlier 
stages; more complex features are ana¬ 
lyzed later. Other aspects of visual process¬ 
ing, such as color and motion analysis, 
proceed in parallel with the analysis of 
form. 

Both the retina and cortex are organized 
into layers of cells with interconnections 
within and between layers. Within an ana¬ 
tomical layer, at least for the early process¬ 
ing stages, there is a population of cells 
each of which performs approximately the 
same processing function on its inputs. 
This population of cells can be thought of 
as an array of filters. Each cell processes 
input from a limited region of visual space, 
called the “receptive field” of that cell. 
More than one population of cells can 
share an anatomical layer. 


Many cells respond to input activity by 
firing an electrical pulse, or action poten¬ 
tial, that travels down the output fiber, or 
axon. These pulses cause a chemical neu¬ 
rotransmitter substance to be released at 
synapses, or regions of near-contact with 
other cells. The latter cells receive and pro¬ 
cess these chemical input signals. Some 
cells, for example in the retina, do not pro¬ 
duce action potentials, but instead exhibit 
more graded electrochemical phenomena 
that can be used for signaling. 

Although a cell’s response function is in 
general nonlinear, visual neurophysiolo¬ 
gists have found that for many cells, a lin¬ 
ear summation approximation is 
appropriate. In this approximation, the 
cell’s output response varies monotonically
with some linear combination of the
cell’s input signal values. For cells that pro¬ 
duce action potentials, the output response 
can be defined as the firing rate at which 
the cell generates action potential pulses in 
response to its input signals. 

Specification of the network. Will a simple
self-adaptive network develop feature-analyzing
cells without our specifying
which features are to be analyzed? If it 
does, are these cell types related to those 
observed in biological systems? To address 
these questions, we first study a network 
that embodies some of the important bio¬ 
logical properties described above, but 
omits many complicating factors. This 
approach is useful both because many of 
the details are unknown, and because our 
goal is to understand what principles are 
most important for the development of 
perceptual functions. For example, if we 
want to know how nonlinearity of 
response may be important for develop¬ 
ment, it is valuable to see first whether a 
linear response system exhibits the main 
feature-analyzing properties that are bio¬ 
logically observed. Also, feedback connec¬ 
tions from later to earlier processing stages 
are known to exist, but it is not known how 
these connections might relate to the devel¬ 
opment of feature-analyzing functions. 
(There are many other functions that feed¬ 
back may serve, such as control of 
dynamic range, attentional mechanisms, 
and so on.) We choose to analyze networks 
without feedback, to understand their 
developmental properties first. 

The interconnections within the retina 
are known to be more complicated than a 
simple feedforward arrangement. Also, 
mechanisms that are not dependent on 
neural activity appear to be involved in the 
development of some feature-analyzing
properties. The main purpose of our simulations
is to explore what types of simple
yet biologically plausible development 
rules suffice to generate feature-analyzing 
cell assemblies, rather than to rule out 
other ways of generating them. From the 
results of our simple model, we will infer 
a potential organizing principle that can 
encompass nonlinear cell response, more 
complex connectivity, and a variety of 
ways of forming and modifying con¬ 
nections. 

Our network is shown in Figure 1. The 
cells are organized into two-dimensional 
layers A, B, C, and so on, with feedfor¬ 
ward connections to each cell from an 
overlying neighborhood of cells of the 
previous layer. Layer A receives input 
from the visual world (if there is any such 
input). We focus especially on the case in 
which there is no input, but instead only 
random activity of the cells of layer A, 
with no correlation of activity from one 
cell to the next. This activity resembles ran¬ 
dom noise or snow on a TV screen. We 
consider this case in order to understand 
how certain feature-analyzing cells may 
emerge even before birth, as has been 
observed in certain primates. 

The positions of the connections to each 
cell need not be regular as in Figure 1, but 
can be chosen randomly according to a 
density distribution, such as a Gaussian, 
that favors connections from nearby cells 
of the previous layer. For simplicity, these 
positions are fixed for the duration of the 
development process. Each cell, at each 
time, has some signaling activity which we 
denote by a real number. Each cell exhibits 
a simple linear response, that is, the out¬ 
put is a linear combination of the inputs, 
with each input being weighted by a con¬ 
nection strength that will develop in a cer¬ 
tain way. Each model cell thus acts as a 
linear filter. 

Two points should be noted: 

(1) Defining the output response as a 
nonlinear, for example sigmoid, function 
of the weighted sum of the inputs would 
more closely approximate some properties 
of the firing rates of biological neurons. 
These are always nonnegative and saturate 
at some maximum rate. However, we will 
see that even a linear response rule can lead 
to the formation of feature-analyzing cells, 
and we will explore what properties of lin¬ 
ear adaptive filters are responsible for this 
formation. Some of the insights gained 
will be applicable to the nonlinear response 
case as well. 

(2) Any transformation implemented
by a feedforward sequence of layers of linear
filters is a linear transformation, and
hence could be implemented by a single
layer of connections with properly chosen,
or in this case hardwired, connection
strengths. However, our purpose is not to
implement a particular transformation,
but rather to study what transformations
are learned by a network without supervision.
This multistage learning process
depends upon the presence of multiple network
layers.

Figure 1. A layered self-adaptive network with local feedforward connections.
Each two-dimensional layer contains many cells. Five input connections to each of
two cells in layers B, C, and D are shown. Several hundred inputs to each cell are
used in simulations. Each cell also provides input to many cells of the following
layer. Lateral connections within a layer, as discussed in the text, are not indicated
here.
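Point (2) can be checked numerically. The following is a minimal sketch in Python with NumPy; the layer sizes and the random connection strengths are made up for illustration and do not come from the simulations. It shows that two stacked layers of linear cells act exactly like one layer whose connection matrix is the product of the two.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes and random connection strengths (illustrative only).
n_a, n_b, n_c = 12, 8, 5
c_ab = rng.normal(size=(n_b, n_a))   # strengths of A-to-B connections
c_bc = rng.normal(size=(n_c, n_b))   # strengths of B-to-C connections

activity_a = rng.normal(size=n_a)    # one "snapshot" of layer A activity

# Two-stage feedforward pass: each cell outputs a weighted sum of its inputs.
activity_c_two_stage = c_bc @ (c_ab @ activity_a)

# Equivalent single layer with hardwired strengths.
c_ac = c_bc @ c_ab
activity_c_one_stage = c_ac @ activity_a

same = np.allclose(activity_c_two_stage, activity_c_one_stage)
print(same)
```

The equivalence says nothing about how the strengths are reached; in the text, the interest lies in what the layers learn one after another, not in the final composite transformation.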

A Hebb rule. For the development pro¬ 
cess, we use a version of an idea proposed 
by the neuropsychologist Donald Hebb in 
1949. This idea has been central to much 
work on synthetic neural networks over 
the years, as well as to the thinking of neu¬ 
roscientists about how the development of 
synaptic connections may relate to mem¬ 
ory and learning phenomena. Hebb’s idea 
was that if cell 1 is one of the cells providing
input to cell 2, and if cell 1’s activity
tends to be “high” whenever cell 2’s 
activity is “high”, then the future contri¬ 
bution that the firing of cell 1 makes to the 
firing of cell 2 should increase. 

In the language of neural networks, the 
connection strength is increased, or made 
more positive. A mathematical formula¬ 
tion needs to be more precise than this, and 
state under what conditions the strength 
may decrease. We use a form in which the 
change in strength contains a term propor¬ 
tional to the product of input and output 
activities at that connection. The Hebbian 
idea of modifying connection strengths 
according to the degree of correlated 
activity between input and output is cen¬ 
tral to what follows. 

For an analogy to a Hebbian rule, con¬ 
sider a group of people whose collective 
opinion on a question is by definition the 
weighted average of the opinions of its 
members. If, over time, a member’s opin¬ 
ion tends to agree with the group’s opin¬ 
ion, then the analog of the Hebb rule states 
that the individual member’s vote on 
future issues is to be weighted more 
strongly. The member’s vote is given less 
weight, or even negative weight, if he con¬ 
sistently disagrees with the group’s opin¬ 
ion. This type of positive-feedback control 
of weighting factors tends to lead to con¬ 
sensus within the group. As we shall see, 
it has other surprising consequences for 
the properties of the group, or output cell, 
response. 

Mathematical formulation. This subsection
and the next summarize simulations
that are described in detail in my
previous work. 1

Consider a cell M and the cells $L_1, L_2, \ldots, L_N$
that provide input to M. For
simplicity, we avoid treating effects that
depend upon the time sequence of signal
activity values. Instead, we think of the
activity history of a layer as a set of “snapshots,”
in which the ordering of the snapshots
plays no role. That is, a set of activity
values, denoted by $(L_1^n, L_2^n, \ldots, L_N^n)$, is
presented as input to the M cell, the M cell
generates an output activity value $M^n$,
and a new set of input activities is then
presented. The superscript n indexes the
presentation of inputs, that is, the particular
snapshot, and the corresponding output.
Then the linear response rule is

$$M^n = a_1 + \sum_j L_j^n c_j \qquad (1)$$

where $c_j$ is the strength of the jth input
connection to the M cell. Our Hebb-type
rule is

$$(\Delta c_i)^n = a_2 L_i^n M^n + a_3 L_i^n + a_4 M^n + a_5 \qquad (2)$$


where the a’s are arbitrary constants
($a_2 > 0$). We assume that the c values
change slowly from one presentation to the
next. Then we can average Equation 2 over
an ensemble of many presentations, and
use Equation 1 to express $M^n$ in terms of
the $\{L_j^n\}$ to obtain the rate of change of
each c value. Some algebraic
manipulation 1 gives

$$\dot{c}_i = \sum_j Q_{ij} c_j + \left[ k_1 + (k_2/N) \sum_j c_j \right] \qquad (3)$$

where $k_{1,2}$ are particular combinations of
the constants $a_{1-5}$. Apart from the determined
values of $k_{1,2}$, the constants $a_{1-5}$
play no further role in what follows. Here

$$Q_{ij} \equiv \langle (L_i^n - \bar{L}) \times (L_j^n - \bar{L}) \rangle \qquad (4)$$

is the covariance of the activities of input
cells i and j, where $\langle \ldots \rangle$ and the
overbar both denote the ensemble average.
(For our purposes, $\bar{L}$, the ensemble average
of the input activity at a synapse, can
be taken to be the same for all synapses i,
j.) The appearance of the input covariance
matrix Q does not mean that there is any
direct interaction between synapses i and
j. Q appears simply because the Hebb rule
causes $\dot{c}_i$ to depend upon the product
$L_i^n M^n$, and $M^n$ in turn depends upon
all the $\{L_j^n\}$ values (via Equation 1). The
Q matrix will play an important role in
what follows.

Figure 2. Receptive field map of a computed orientation-selective cell. A point of
illumination at any position in the plane evokes an output response from the model
cell that is proportional to the contour value at that position. Positive contour
values (solid curves) denote an excitatory output response; negative values (dotted
curves) denote an inhibitory response. Contour values range from -0.45 to +0.75
in steps of 0.30. The peak response (at the receptive field center) is normalized to
unity. The parameter values that generated this particular orientation-selective cell,
and the units ($r_G$) of distance along the axes, are given in reference 1. (See Figure
1a, p. 8780). Axes denote distance of illumination point from receptive field center.

To prevent c values from becoming 
infinite during the development process, a 
saturation constraint is imposed. Each c 
value is constrained to lie between two 
values $c_-$ and $c_+$. In a more biologically
realistic case, there are excitatory synapses
that have $0 < c < c_+$ and inhibitory synapses
that have $c_- < c < 0$. The analysis
of this case gives the same result. 
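The development process of Equation 3, together with the saturation constraint, can be sketched as a simple numerical integration. The code below is an illustration only: the function name `develop`, the Euler step size, the bounds, and the values of $k_1$ and $k_2$ are assumptions chosen for the example, not the parameters used in the simulations.

```python
import numpy as np

def develop(q, k1, k2, c_minus, c_plus, steps=2000, dt=0.01, seed=0):
    """Euler-integrate Equation 3, clipping each c to [c_minus, c_plus]."""
    rng = np.random.default_rng(seed)
    n = q.shape[0]
    c = rng.uniform(c_minus, c_plus, size=n)     # random initial strengths
    for _ in range(steps):
        c_dot = q @ c + k1 + (k2 / n) * c.sum()  # right-hand side of Equation 3
        c = np.clip(c + dt * c_dot, c_minus, c_plus)
    return c

# Uncorrelated "snow" input: Q is the identity matrix.
n_inputs = 50
q_snow = np.eye(n_inputs)
c_mature = develop(q_snow, k1=1.0, k2=0.2, c_minus=-0.5, c_plus=0.5)
print(c_mature.min(), c_mature.max())
```

With these illustrative parameters the rate of change of every strength stays positive, so every c value ends at the upper bound. Other choices of $k_1$ and $k_2$ leave some strengths at the lower bound instead.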


First the connections from layer A to B
mature, or develop to their final values.
That is, the initial c values are chosen at
random, and the set of differential equations
given by Equation 3 (for i = 1, 2, ..., N)
is solved, using the $Q_{ij}$ function that
applies to layer A activity. (For random
snow activity in layer A, $Q_{ij}$ is 1 when i
and j are the same A cell, and 0 otherwise.)
Knowing the mature c values for the A-to-B
connections, as well as the $Q_{ij}$ function
for layer A, then allows us to compute the
$Q_{ij}$ function for the mature layer B. Then
the development of the B-to-C connections
is computed, using the $Q_{ij}$ function
appropriate to layer B. By repeating the
process, we compute in turn the connection
strengths for successive layers of connections.
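Because each cell responds linearly, the covariance function of a mature layer follows from the covariance of the layer below and the mature connection strengths. If row m of a matrix C holds the strengths into cell m, the covariance of the new layer is $C Q C^T$. The sketch below assumes a one-dimensional layer, a fixed input window, and strengths that have all saturated at a hypothetical upper bound; these are simplifications for illustration, not the two-dimensional geometry of the simulations.

```python
import numpy as np

n_a, n_b, half_width, c_plus = 40, 30, 5, 0.5   # hypothetical sizes and strength

# Row m holds the mature strengths of the connections into B cell m.
# Every connection inside the cell's input window sits at c_plus.
c_ab = np.zeros((n_b, n_a))
for m in range(n_b):
    center = m + half_width                      # A cell directly "above" B cell m
    c_ab[m, center - half_width:center + half_width + 1] = c_plus

q_a = np.eye(n_a)              # random snow: A cells are uncorrelated
q_b = c_ab @ q_a @ c_ab.T      # covariance of B activities under a linear response

print(q_b[10, 10], q_b[10, 12], q_b[10, 25])
```

B cells whose input windows overlap have positive covariance, and the covariance falls to zero once the windows no longer share any A cell. The same product can then be applied again, with the mature B-to-C strengths, to obtain the covariance function of layer C.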


Simulation results. A few parameters 
for each layer of cells determine the mature 
c values of the cells in that layer. These 
parameters include $k_1$ and $k_2$ and the
breadth of the region in the previous layer
that provides input to a cell of the developing
layer. (See Figure 1.) As we shall see,
the choice of the $k_{1,2}$ values determines
the mature value of the total connection
strength $\sum_j c_j$ of the inputs to the M cell.

When we explore the parameter space, 
we find that there are a limited number of 
ways each layer can develop. Briefly, we 
find that a sequence of feature-analyzing 
cell types emerges as one layer after 
another matures. 

The first cell type emerges in layer B. 
There is a parameter regime in which each 
c value reaches its excitatory limit $c_+$. In
this case, each B cell, once it has matured, 
computes the local average of the activity 
in the overlying region of layer A from 
which it receives input. 

Once the B cells have matured in this 
way, nearby B cells have correlated 
activity. Each activity pattern in layer B is 
a blurred image of random snow. If one B 
cell’s activity happens to be “high” at a 
given time, its neighbors’ activities are 
likely to be “high” also. As a result of this
activity correlation, a new cell type 
emerges in layer C. This center-surround 
cell type 1 acts as a contrast-sensitive 
filter—it responds maximally to a bright 
circular spot centered on the cell’s recep¬ 
tive field, against a dark background. 
Center-surround cells having the reverse 
property—they respond maximally to a 
dark spot on a bright background—also 
emerge. 

The Q function for pairs of center- 
surround cells in layer C determines the 
developmental possibilities for the C-to-D 
connections, and so on. We find that the 
next new type of feature-analyzing cell to 
emerge as we pass to succeeding layers is 
an orientation-selective cell. This cell 
responds maximally to a bright edge or bar 
against a dark background, or the reverse, 
when the edge or bar has a particular 
orientation. The receptive field map for 
such a computed cell is shown in Figure 2. 
This map is a contour plot showing the 
response of the cell to point illumination, 
as a function of the position of the illumi¬ 
nation in visual space. 

Each orientation-selective cell will 
develop to favor an arbitrary orientation 
if the network contains only feedforward 
connections as in Figure 1. However, if 
lateral connections between nearby cells of 
the orientation-selective cell layer are
included in the simulation, then the orientation
preferences of the cells in the layer
can become organized in certain arrange¬ 
ments. Cells having similar orientation 
preferences develop to occupy irregular 
band-shaped regions. (See reference 1 and 
the front cover, right side, of this issue.) 

Discussion of the simulations. Center- 
surround cells are a prominent feature of 
mammalian retina. Orientation-selective 
cells emerge in cat and monkey visual cortex. 3,4
Irregular band-shaped regions of
cells of similar orientation—called orien¬ 
tation columns —are a prominent feature 
in the orientation-selective cell layers. 3,4 
(Once again, see the front cover, left side.) 
The role that lateral connections in cortex 
play in the formation of orientation selec¬ 
tivity is at present experimentally unset¬ 
tled. As we noted, certain primates exhibit 
well-formed orientation selectivity at 
birth, in the absence of any structured vis¬ 
ual experience. 

Our point is not to suggest that feature¬ 
analyzing cells—particularly the center- 
surround cells—arise in animals in the 
same way they do in this synthetic net¬ 
work. As noted previously, the anatomy 
of inter-layer connections in the retina is 
more complex than a simple feedforward 
arrangement. Furthermore, center- 
surround cells can be constructed by a sim¬ 
ple non-adaptive model in which excita¬ 
tory inputs from some narrow region, and 
inhibitory inputs from a broader region, 
both converge on a cell. In our simulations 
we assumed that the breadth of the input 
region to a cell was the same for excitatory 
and inhibitory synapses, in order to avoid 
biasing the solution toward the formation 
of a center-surround cell type. 

Our point is rather that a set of progres¬ 
sively more complex feature-analyzing cell 
types develops in the layered network, and 
that these cell types, and their organiza¬ 
tion, qualitatively exhibit some of the most 
salient features found in the first few stages 
of mammalian visual processing. The 
results suggest that some properties whose 
origin has been mysterious—such as orien¬ 
tation selectivity — may have a natural 
explanation in terms of the functioning of 
a Hebb-type development process in a 
layered network. 

Two simple examples of how structured
input to layer A would affect the simula¬ 
tion results are worth noting: 

(1) If nearby pixels have correlated 
intensity values, and this is the only impor¬ 
tant input correlation present, then Q in 
layer A would resemble the Gaussian Q
that we found in layer B. The subsequent
development of the model would proceed 
in a way similar to that which we 
described, except that the appearance of 
each feature-analyzing cell type could be 
advanced one layer. 

(2) If layer A is shown an ensemble of 
patterns, each consisting of sinusoidal 
stripes with arbitrary phase and orienta¬ 
tion, then orientation selectivity can 
develop as early as layer B. 1 

We have assumed, for simplicity, that 
the statistical properties of the ensemble of 
presentations, that is, the covariances $Q_{ij}$,
are unchanged or stationary during devel¬ 
opment. If the ensemble statistics change, 
cells that had reached their apparently 
final mature c values may change these c 
values in accordance with the new ensem¬ 
ble characteristics. Thus, although we 
always speak of cell development, the pres¬ 
ent approach is equally applicable to 
studying questions of cell plasticity during 
the life of the animal. 


Hebb rules and 
optimization properties 

We have seen that even a simple layered 
network with local feedforward connec¬ 
tions obeying a Hebb-type rule develops a 
sequence of progressively more sophisti¬ 
cated feature-analyzing properties as we 
pass from one layer to the next. We will 
now examine some remarkable optimiza¬ 
tion properties of a Hebb-type rule. 

Maximization of output activity vari¬ 
ance. Consider a cell M that receives input 
from cells $L_1, L_2, \ldots, L_N$. Here and
later, “input” means local input to cell M, 
not the environmental input to the net¬ 
work as a whole. Similarly, “output” 
refers to the M cell’s activity value, not to 
the output from the network as a whole. 
Let the M cell’s development be de¬ 
scribed as in Equations 1-4, with a satu¬ 
ration constraint on the range of each c 
value. We assume that the ensemble 
statistical properties of the L-cell activities, 
that is, the $Q_{ij}$ function for the L cells as in
Equation 4, are unaffected by the choice 
of c values. This is true if there is no feed¬ 
back from M, or the cells it influences, to 
the L cells. It should be a satisfactory 
approximation if the feedback is present 
but is sufficiently weak, although this has 
not been studied quantitatively. 

Define the function 


$$E = E_Q + E_k \qquad (5)$$

where

$$E_Q \equiv -(1/2) \langle (M^n - \bar{M})^2 \rangle = -(1/2) \sum_i \sum_j Q_{ij} c_i c_j \qquad (6)$$

and

$$E_k \equiv -k_1 \sum_j c_j - (k_2/2N) \left( \sum_j c_j \right)^2 \qquad (7)$$

I have constructed the function E to
have the property that $-\partial E/\partial c_i = \dot{c}_i$ for
each i. This means that, as the Hebb rule
causes each of the c values to change with
time, the value of E, as a function of the
c’s, decreases along a path of locally
steepest, or gradient, descent. (If $\dot{c}_i > 0$,
then $\partial E/\partial c_i < 0$, so $c_i$ increases and E
decreases with time. If $\dot{c}_i < 0$, then
$\partial E/\partial c_i > 0$, so $c_i$ decreases and E again
decreases with time.)
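The gradient property can be verified numerically. The sketch below is a check under assumed values: the matrix `q`, the strengths `c`, and the constants `k1` and `k2` are arbitrary illustrative choices. It compares a finite-difference estimate of $-\partial E/\partial c_i$ with the right-hand side of Equation 3.

```python
import numpy as np

def energy(c, q, k1, k2):
    """E = E_Q + E_k from Equations 5-7."""
    n = c.size
    e_q = -0.5 * c @ q @ c
    e_k = -k1 * c.sum() - (k2 / (2 * n)) * c.sum() ** 2
    return e_q + e_k

def c_dot(c, q, k1, k2):
    """Right-hand side of Equation 3."""
    return q @ c + k1 + (k2 / c.size) * c.sum()

rng = np.random.default_rng(1)
n = 6
a = rng.normal(size=(n, n))
q = a @ a.T                      # arbitrary symmetric covariance-like matrix
c = rng.normal(size=n)
k1, k2 = 0.3, -1.2               # illustrative values only

# Central finite differences along each coordinate direction.
eps = 1e-6
basis = np.eye(n)
grad = np.array([
    (energy(c + eps * basis[i], q, k1, k2)
     - energy(c - eps * basis[i], q, k1, k2)) / (2 * eps)
    for i in range(n)
])
max_gap = np.max(np.abs(-grad - c_dot(c, q, k1, k2)))
print(max_gap)
```

The check relies on Q being symmetric, which holds for any covariance matrix. The printed gap reflects only floating-point rounding, since E is quadratic in the c values.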

The value of E thus achieves a local 
minimum at cell maturity. Moreover, for 
the cases of interest here—including those 
that lead to the center-surround and 
orientation-selective cell types—this mini¬ 
mum is a global near-minimum as well. 1 
We therefore will focus on the case in 
which the development process does not 
get stuck in high-lying local minima. This 
appears to be the typical case for a percep¬ 
tual network exposed to a large ensemble 
of presentations, although it is an empiri¬ 
cal finding and I have not established the 
limits of its validity. 

What is the meaning of E achieving a 
global, or absolute, minimum value? For 
any given value of total connection 
strength $\sum_j c_j$, E is minimized when
$\langle (M^n - \bar{M})^2 \rangle$, the statistical variance
of M, is maximized. Changing the values
of the parameters $k_{1,2}$ adjusts, or tunes,
the mature value of $\sum_j c_j$. The $E_k$ term,
which is a function of $\sum_j c_j$ and $k_{1,2}$ only,
plays a role similar to a Lagrange multiplier
term, although $E_k$ is parabolic
rather than linear in $\sum_j c_j$.

Therefore, the development rule of 
Equation 3 causes a cell to develop so as to 
maximize the variance of its output 
activity, subject to the constraint that the 
total connection strength have a given, 
parameter-determined, value and subject 
to the saturation bounds for each c value. 
Let us see intuitively what variance max¬ 
imization means to a perceptual system. 

Consider first a hypothetical M cell
whose c values are such that the cell’s output
variance is zero. That is, regardless of
the input values $(L_1^n, L_2^n, \ldots, L_N^n)$ chosen
from the ensemble of presentations,
the output is always the same. This cell
would be useless for conveying any information
about the environment to later
parts of the perceptual system.

Figure 3. Relationship between networks: (a) a single M cell (with N inputs) of a
layered self-adaptive network; (b) a Hopfield network with N cells and N(N-1)/2
connections, where N = 5.

On the other hand, if the c values are 
chosen in a different and special way, then 
the M cell’s output value exhibits the 
largest possible spread or variance, consis¬ 
tent with the constraints on the c’s, as the 
set of input values ranges over its ensem¬ 
ble. We have shown that a Hebb-type rule 
tends to generate c values satisfying this 
special condition. In an informal sense, 
provided certain conditions are met, the 
Hebb rule acting on our described M cell 
tends to produce an M cell whose output 
activity optimally preserves the informa¬ 
tion contained in the set of input activities. 
Later, we will make this statement more 
precise, by applying some concepts from 
information theory, and we will modify it 
to accommodate the situation in which 
multiple M cells interact with one another. 

Optimization in another type of neural 
network. Hopfield 5 emphasized that the 
dynamics of a neural network can be 
described in some cases by the local 
minimization of a function. An interesting 
mathematical relationship exists between 
the E function defined in Equations 5-7 
and Hopfield’s energy function— 
although the network structure and 
behavior that each describes are very 
different. 


Once again, our E function is
$E = E_Q + E_k$ where $E_Q$ is shown in Equation 6.
The development rule causes $E_Q$ to
be minimized subject to the constraint that
$\sum_j c_j$ have a specified value and subject to
the saturation constraints on each c value.
The arrangement described by Equation 6
consists of one M cell with N inputs from
cells $L_1, L_2, \ldots, L_N$ and is shown in
Figure 3a. The N × N matrix of elements
$Q_{ij}$ is the covariance matrix of the input
cell activities. The c’s are the connection 
strengths from each input cell to the out¬ 
put cell. The minimization of E describes 
the development of the c’s under the 
influence of the ensemble of inputs charac¬ 
terized by the covariance matrix Q. 

In Hopfield’s case, 5 as illustrated in
Figure 3b, there are N cells and the activity
state of the ith cell is called $V_i$. Each pair
of cells is connected with fixed connection
strength $T_{ij}$, so the number of connections
is of order $N^2/2$, and the energy
function is

$$E' = -(1/2) \sum_i \sum_j T_{ij} V_i V_j \qquad (8)$$

The activities $V_i$ change with time
according to a linear summation rule with
a threshold: $V_i$ increases, unless it is
already at its upper limit, if $\sum_j T_{ij} V_j > 0$,
and decreases if $\sum_j T_{ij} V_j < 0$. If the $T_{ij}$
matrix is symmetric, then the $V_i$’s change
so as to decrease the value of E' to a local
minimum. Connection strengths are fixed;
there is no learning or network development.
The dynamical process described by
Equation 8 is the change in the activities
$\{V_i\}$ from some initial state to a final state
of locally minimum E'. If we want to use
the network for memory retrieval, a suitable
choice of $T_{ij}$ is given by an expression
that is essentially the covariance of $V_i^k$
and $V_j^k$ over the ensemble of memories,
indexed by k, to be stored.
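The descent property of Equation 8 can be illustrated with a small simulation. The sketch below makes several assumptions that are not taken from the text: activities are restricted to the two values -1 and +1, T is built as the average outer product of a few random stored patterns with the self-connections set to zero, and cells are updated one at a time in random order.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cells, n_memories = 40, 3                       # illustrative sizes

memories = rng.choice([-1.0, 1.0], size=(n_memories, n_cells))
t = memories.T @ memories / n_memories            # symmetric, covariance-like T
np.fill_diagonal(t, 0.0)                          # no self-connections

def energy(v):
    """E' of Equation 8."""
    return -0.5 * v @ t @ v

# Start from a corrupted copy of the first memory.
v = memories[0].copy()
flipped = rng.choice(n_cells, size=8, replace=False)
v[flipped] *= -1.0

energies = [energy(v)]
for _ in range(5):                                # a few asynchronous sweeps
    for i in rng.permutation(n_cells):
        drive = t[i] @ v                          # sum_j T_ij V_j
        if drive > 0:
            v[i] = 1.0                            # raise V_i to its upper limit
        elif drive < 0:
            v[i] = -1.0                           # lower V_i to its lower limit
        energies.append(energy(v))

never_increases = all(b <= a + 1e-9 for a, b in zip(energies, energies[1:]))
print(never_increases, np.array_equal(v, memories[0]))
```

The first printed value reports whether E' ever rose during the updates; with a symmetric T and no self-connections it cannot. The second reports whether the corrupted start state relaxed back to the stored memory, which depends on the number of memories and on how badly the start state was corrupted.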

Note that E' has the same structure
as our $E_Q$, if we identify $V_i$ with $c_i$ and $T_{ij}$
with $Q_{ij}$. When T is a covariance matrix,
Hopfield’s network computes a local minimum
of E' using N cells and order $N^2/2$
connections, explicitly embodying the T
values. The state for which E' is minimal
is the set of final activities $(V_1, V_2, \ldots, V_N)$.

One cell of our network computes a
local minimum of the same function, our
$E_Q$, using N connections. The Q function,
which corresponds to T, is nowhere
explicitly represented in the network. The
Hebb rule implicitly responds to the covariance
matrix, Q, as the ensemble of input
patterns is presented to the M cell. The
state for which $E_Q$ is minimal is not a set
of activities, but a set of mature connection
strengths $(c_1, c_2, \ldots, c_N)$.

Thus, for T matrices that are covariance 
matrices, one cell of our network can 
locally optimize the same function as a 
fully connected Hopfield-type network. In 
our network, this optimization process 
consists of developing a final set of c 
values, starting with some initial set of 
values, under the influence of a statistically 
stationary ensemble of input patterns hav¬ 
ing covariance matrix T. In the Hopfield- 
type network case, the process consists of 
seeking a final set of cell activity values 
starting with some initial set of values, in 
a network whose connection strengths are 
fixed and prespecified to be the T values 
themselves. 

These considerations lead to an interest¬ 
ing connection, only briefly outlined here, 
between memory retrieval and perception 
in a network model. 

Memory retrieval and perception in a 
network model. If there are sufficiently
few memory patterns to be stored, relative
to N, then E' or $E_Q$ will tend to have
minima at the $\{V_i\}$ or $\{c_i\}$ values, respectively,
corresponding to those memories.
Depending upon the initial choice of the 
V’s or c’s, one or another of these mem¬ 
ory states will be activated or selected. 
In the case of Hopfield’s network, “activated”
means that the final activity state
will match one of the stored memories. In
the case of a cell in a layered self-adaptive 
network, “selected” means that the final 
set of c values will cause the M cell to be a 
matched filter for one of these memories. 
That is, the mature M cell will respond 
most strongly when presented with the set 
of input activities corresponding to that 
memory. 

If the number of patterns in the ensemble
is large, then the $E_Q$ function will no
longer capture details of any one of the
patterns. The structure of the $E_Q$ function
may become simpler. The global minimum
of $E_Q$ will lie at the $(c_1, c_2, \ldots)$ value
for which the M cell’s variance is maximized.
The mature M cell will function as
a feature-analyzing cell, rather than as a 
matched filter to a particular memory. The 
particular feature or pattern element to 
which the mature cell will optimally 
respond, such as an oriented edge, need 
not even appear in any of the presented 
patterns. 

Principal component analysis. There is 
a special case in which variance maximiza¬ 
tion corresponds to an important, and 
widely-used, statistical method for feature 
extraction. This is the case in which the
output variance is maximized subject to
the constraint that $\sum_i c_i^2 = 1$. Oja 6 showed
that this maximization can be achieved by
using a particular form of the Hebb rule,
equivalent to

$$\dot{c}_i \propto \langle M^n (L_i^n - M^n c_i) \rangle \qquad (9)$$

For this expression, we put $M^n = \sum_i L_i^n c_i$
and define the activities, subtracting nonzero
mean values if necessary, so that
$\langle L_i^n \rangle = 0$ for all i. The additional term in
the Hebb-type rule, proportional to $c_i$,
causes $\sum_i c_i^2$ to be close to 1, and no explicit
constraint needs to be imposed.
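The behavior of Equation 9 can be sketched by integrating its ensemble-averaged form. For zero-mean inputs, $\langle M^n L_i^n \rangle = \sum_j Q_{ij} c_j$ and $\langle (M^n)^2 \rangle = \sum_{ij} c_i Q_{ij} c_j$, so the averaged rule reads $\dot{c} = Qc - (c^T Q c)\,c$. In the code below the covariance matrix, its eigenvalues, the step size, and the starting strengths are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5

# Hypothetical covariance matrix with known eigenvalues and eigenvectors.
u, _ = np.linalg.qr(rng.normal(size=(n, n)))
q = u @ np.diag([3.0, 1.0, 0.5, 0.2, 0.1]) @ u.T
top = u[:, 0]                         # eigenvector with the largest eigenvalue

c = 0.1 * rng.normal(size=n)          # small random initial strengths
dt = 0.01
for _ in range(5000):
    # Ensemble average of M (L_i - M c_i): <M L_i> = (Q c)_i and <M^2> = c.Q.c
    c = c + dt * (q @ c - (c @ q @ c) * c)

norm_sq = c @ c                       # should approach 1
alignment = abs(c @ top)              # should approach 1 as well
print(norm_sq, alignment)
```

The strengths settle on a vector of unit length that points along the leading eigenvector of Q, without any explicit normalization step.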

In statistics, principal component anal¬ 
ysis, or PCA, is a standard method, 
reviewed in Huber, 7 for identifying 
“interesting” but unanticipated structure, 
such as clustering, in high-dimensional 
data sets. For example, an economist con¬ 
fronted with 1000 dimensions of data, 
such as the prices of different commodi¬ 
ties, may want to know which several fea¬ 
tures of the data, for example, which 
several linear combinations of the 1000 
quantities, are most salient. 

PCA works as follows. Consider a set of
data points indexed by n, each point $L^n$
having coordinates $(L_1^n, L_2^n, \ldots, L_N^n)$.
For PCA, we compute a vector c for which
the projection of the set of data points
onto the axis parallel to c has maximum
variance. The projection of $L^n$ onto c,
when $\sum_i c_i^2 = 1$, is just $M^n = \sum_i L_i^n c_i$, and
the variance of the projected distribution
is identical to the variance of $M^n$.

Figure 4. Illustration of principal component analysis. A cloud of data points is
shown in two dimensions, and the density plots formed by projecting this cloud
onto each of two axes 1 and 2 are indicated. The projection onto axis 1 has maximum
variance, and clearly shows the bimodal, or clustered, character of the data.

An example of PCA is illustrated in Fig¬ 
ure 4. Projecting the cloud of data points 
onto line 1 captures the salient feature of 
the data—that there are two clusters. The 
variance, or spread, of the data points 
along this axis is greater than for any other 
projection axis. Projecting the cloud onto 
line 2 would obscure the cluster structure. 
While the cluster structure is evident in the 
raw data of the two-dimensional plot 
shown here, such structure is often totally 
concealed in high-dimensional data sets, 
until an analysis method such as PCA is 
applied. 

Since the PCA method corresponds to 
choosing c so as to maximize the variance 
of $M^n$ subject to $\sum_i c_i^2 = 1$, it follows that
the mature M cell generated by Oja’s ver¬ 
sion of the Hebb rule performs PCA on its 
set of inputs. 6 


Optimal inference. Consider an arbitrary
M cell characterized by a set of c
values and having the linear response rule
$M^n = \sum_i L_i^n c_i$ with $\langle L_i^n \rangle = 0$ for all i.
Suppose we know the c values, and are told
a particular value of the output, $M^n$. We
are asked to estimate the input activities
$(L_1^n, L_2^n, \ldots, L_N^n)$ for that presentation.
Let us score any such estimate by

(1) computing the difference between
the estimate $L_i^n(\mathrm{est})$ and the true
value of $L_i^n$

(2) squaring this difference, and

(3) summing this squared error over i.

Averaging this score over an ensemble of
presentations gives the mean square error

$$\mathrm{MSE} \equiv \sum_i \langle [L_i^n - L_i^n(\mathrm{est})]^2 \rangle \qquad (10)$$

What estimation rule will give the best,
meaning the minimum, MSE? For a linear
estimation rule of the form $L_i^n(\mathrm{est}) \equiv g_i M^n$,
where we want to know what g
values to use, the answer is found by
minimizing MSE with respect to each of
the g’s. This is easily done by differentiating
MSE. It is also a simple case of the
Gauss-Markoff theorem, 8 which applies
more generally to the optimal estimation
of a set of inputs given a set of outputs,
rather than just one output. The result is

$$L_i^n(\mathrm{opt\ est}) = M^n \times \left( \sum_j Q_{ij} c_j \right) \Big/ \left( \sum_j \sum_k c_j Q_{jk} c_k \right) \qquad (11)$$

The MSE corresponding to this optimal
estimate is then

$$\mathrm{MSE(opt)} \equiv \sum_i \langle [L_i^n - L_i^n(\mathrm{opt\ est})]^2 \rangle = \sum_i \langle (L_i^n)^2 \rangle - H \qquad (12)$$

where

$$H \equiv \left[ \sum_i \left( \sum_j Q_{ij} c_j \right)^2 \right] \Big/ \left( \sum_j \sum_k c_j Q_{jk} c_k \right) \qquad (13)$$

Expressed in matrix form, with c denoting
the column vector $(c_1, c_2, \ldots)$ and Q
denoting the matrix $(Q_{ij})$, we have
$H = (c^T Q Q c)/(c^T Q c)$, where the superscript T
denotes the matrix transpose.
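Equations 11 through 13 can be checked against a direct computation on sampled data. The sketch below draws a hypothetical ensemble of zero-mean inputs, picks an arbitrary filter c, forms the optimal linear estimate of Equation 11, and compares the measured mean square error with the closed form of Equation 12. The ensemble size, the mixing matrix, and the filter are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, n_presentations = 4, 20000
mix = rng.normal(size=(n, n))
inputs = rng.normal(size=(n_presentations, n)) @ mix.T   # inputs L^n, zero mean in expectation
c = rng.normal(size=n)                                   # arbitrary filter

q = inputs.T @ inputs / n_presentations    # sample second moments Q_ij = <L_i L_j>
m = inputs @ c                             # outputs M^n

g = (q @ c) / (c @ q @ c)                  # gain factors in Equation 11
estimate = np.outer(m, g)                  # L_i^n(opt est) = g_i M^n
mse_direct = np.mean(np.sum((inputs - estimate) ** 2, axis=1))

h = (c @ q @ q @ c) / (c @ q @ c)          # Equation 13 in matrix form
mse_formula = np.trace(q) - h              # Equation 12

print(mse_direct, mse_formula)
```

The two numbers agree because the averages in Equations 11 through 13 are taken over the same sample that is used to measure the error; the trace of Q plays the role of $\sum_i \langle (L_i^n)^2 \rangle$.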

The calculation so far involves a stan¬ 
dard use of optimal estimation theory. 8 
The linear filter, represented here by the set 
of c values, is specified. The result of a 
measurement using the filter—that is, the 
output value—is given. The task is to 
reconstruct the input values with minimum 
error, using a simple mean squared error 
criterion. 

We now go beyond this simple frame¬ 
work to ask 9 : For what linear filter—what 
set of c values—is this minimum-error 
reconstruction the most accurate? That is, 
what choice of c’s minimizes MSE(opt) of 
Equation 12? 

Since the first term on the right-hand 
side of Equation 12 is independent of the 
c’s, minimizing MSE(opt) is accomplished 
by maximizing H. The mathematical con¬ 
dition for this to occur is that the vector c 
be an eigenvector of Q having maximal 
eigenvalue. This is identical to the condi¬ 
tion that c needs to satisfy in order for the 
M cell to perform PCA on its input values. 
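The eigenvector condition can be illustrated numerically. Writing c in the eigenbasis of Q shows that H is a weighted average of the eigenvalues of Q, so it cannot exceed the largest one. The sketch below, which uses an arbitrary made-up covariance matrix, evaluates H at the leading eigenvector and at many random filters.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
a = rng.normal(size=(n, n))
q = a @ a.T                                # hypothetical covariance matrix

def h_value(c):
    """H of Equation 13 in matrix form."""
    return (c @ q @ q @ c) / (c @ q @ c)

eigvals, eigvecs = np.linalg.eigh(q)       # eigenvalues in ascending order
h_top = h_value(eigvecs[:, -1])            # H at the leading eigenvector

h_random = np.array([h_value(rng.normal(size=n)) for _ in range(1000)])
print(h_top, eigvals[-1], h_random.max())
```

H at the leading eigenvector equals the largest eigenvalue, and no randomly chosen filter exceeds it.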

Therefore the PCA condition and the 
principle of optimal inference—namely, 
that MSE(opt) be minimized, or H be 
maximized—lead to the same set of c 
values. A Hebb rule of the form of Equa¬ 
tion 9 generates an M cell that satisfies 
both conditions. In the presence of other 
constraints, or additional cost terms, there 
is no guarantee that PCA and H-maximization
are equivalent, since the
PCA principle maximizes the quantity
$(c^T Q c)/(c^T c)$ which is not identical to the
expression for H in Equation 13.

Optimization in the presence of process¬ 
ing noise and constraints on output vari¬ 
ance. We have identified several 
optimization properties related to the cell’s 
output variance. Suppose, however, that 
for some reason the variance is itself con¬ 
strained. For example, the output activity 
may be confined to lie within some oper¬ 
ating range. This is a biologically plausi¬ 
ble situation. In this case, what is 
optimized by a suitable Hebb-type rule? 
We will discuss this case for a particular 
processing model, giving only the main 
results and omitting the details. 

Suppose the signal $L_j^n$ on the jth input
line or connection is corrupted by noise,
$\nu_j^n$, where $\nu_j^n$ has a mean of zero and a variance
B, and is uncorrelated both with the
noise on other input lines and with any of
the input signals $L_i^n$. The cell computes the
weighted sum $x = \sum_j (L_j^n + \nu_j^n) c_j$. The variance
of x is the sum of two terms: the variance
due to the signal in the absence of
noise, $\sum_{ij} Q_{ij} c_i c_j$; and the variance due to
the noise, $B \sum_j c_j^2$. Consider a suitable synaptic
modification rule in which $\dot{c}_i$ contains
a term of the form $\langle L_i^n x \rangle$. This
rule causes the model cell to develop such 
that the variance of x due to the signal is 
maximized relative to the variance due to 
the noise. This type of signal-to-noise 
optimization property can also emerge 
when the cell’s output M is a monotonic 
nonlinear function of x, such as a sigmoid 
function, if the synaptic modification rule 
is of the form described. 

Adaptive signal processing. Returning 
to the case of a linear-response model neu¬ 
ron, suppose we wish to train a linear cell 
to respond to each of a set of prescribed 
input vectors by generating an output that 
best matches a prescribed desired output. 
An input vector is denoted $L^\pi \equiv (L^\pi_1, L^\pi_2,
\ldots, L^\pi_N)$ and each desired output is a
scalar number $M^\pi_{des}$. The actual output is
$M^\pi = \sum_j L^\pi_j c_j$, where each $\langle L^\pi_j \rangle = 0$ and
the optimal c values are to be determined
by a learning process. A mean square measure
of error is used:

$MSE' = \langle (M^\pi - M^\pi_{des})^2 \rangle$ (14)

where $\langle \ldots \rangle$ again indicates the
ensemble average. MSE' is a minimum
when the c values are chosen to satisfy
$\langle M^\pi_{des} L^\pi_i \rangle = \sum_j Q_{ij} c_j$ for all $i$. (Recall that
$Q_{ij} = \langle L^\pi_i L^\pi_j \rangle$.) The least mean square,
or LMS, algorithm of Widrow and 


Hoff 10 uses an estimate of the gradient of 
MSE' and in effect performs gradient 
descent to compute the optimal c values. 
An ensemble-averaged form of the algo¬ 
rithm can be written as 

$\dot{c}_i \propto \langle L^\pi_i (M^\pi_{des} - \sum_j L^\pi_j c_j) \rangle$ (15)

Equations 14 and 15 give an objective 
function to be minimized and an algorithm 
for a supervised learning process. Both 
the inputs and the desired outputs are 
presented to the cell, and the error term
$(M^\pi_{des} - M^\pi)$, the amount by which the
actual output differs from the desired
output, is fed back to change the c values
until the mean square error is minimized.
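
As a rough illustration of the supervised procedure in Equations 14 and 15, the following Python sketch applies a per-sample (stochastic) version of the update to a small synthetic problem, rather than the ensemble-averaged form written above. The function name `lms_train`, the learning rate, the number of passes, and the noise-free toy data are all assumptions made for this example; none of them comes from the article.

```python
import random

def lms_train(inputs, desired, rate=0.05, epochs=200):
    """Per-sample LMS: c_i += rate * L_i * (M_des - sum_j L_j c_j)."""
    c = [0.0] * len(inputs[0])
    for _ in range(epochs):
        for L, m_des in zip(inputs, desired):
            m = sum(Lj * cj for Lj, cj in zip(L, c))   # actual output M
            err = m_des - m                            # error term fed back
            c = [ci + rate * Li * err for ci, Li in zip(c, L)]
    return c

def mean_square_error(inputs, desired, c):
    """Sample estimate of MSE' = <(M - M_des)^2>."""
    total = 0.0
    for L, m_des in zip(inputs, desired):
        m = sum(Lj * cj for Lj, cj in zip(L, c))
        total += (m - m_des) ** 2
    return total / len(inputs)

random.seed(0)
true_c = [0.5, -1.0, 2.0]                              # hypothetical target weights
inputs = [[random.gauss(0.0, 1.0) for _ in true_c] for _ in range(50)]
desired = [sum(Lj * cj for Lj, cj in zip(L, true_c)) for L in inputs]

c_learned = lms_train(inputs, desired)
final_mse = mean_square_error(inputs, desired, c_learned)
```

Because the desired outputs here are an exact linear function of the inputs, the learned weights approach `true_c` and the error approaches zero; with noisy targets the procedure would instead settle near the least-squares solution.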

Our optimal inference criterion, 
namely, the minimization of the objective 
function of Equation 12, and a Hebb-type 
rule that implements it (Equation 9) are 
formally similar to Equations 14 and 15. 
But the optimal inference criterion pro¬ 
vides a method for unsupervised learning. 
The criterion does not make any use of a 
desired output; it simply states that the M 
cell should have the property that know¬ 
ing its output activity value allows one to 
infer the input activities with greatest pos¬ 
sible accuracy. 

Information theory 
and the principle of 
maximum information 
preservation 

For a single M cell receiving inputs from 
a given set of L cells, we have seen that, for 
a particular Hebb rule given in Equation 
9, knowledge of the output activity value 
allows inference of the input values with 
greatest accuracy, in the sense of minimum 
mean squared error. For more general 
Hebb-type rules, we found that the vari¬ 
ance of the output activity was maximized 
subject to various constraints. This result 
led us to suggest that, at least in an intui¬ 
tive sense, a Hebb rule may act to gener¬ 
ate an M cell whose output activity 
preserves maximum information about 
the input activities, subject to constraints. 

We will now make this notion of maxi¬ 
mum information preservation more pre¬ 
cise, and will extend it to the case of an 
entire layer of M cells, by introducing 
some concepts from information theory. 
The goal is to see what this principle 
implies for the development of each layer 
of a perceptual system. That is, given the
statistical properties of the ensemble of
input patterns at layer L, and certain
constraints, what particular processing
functions do the connections from layer L to
layer M, and within layer M, develop to
implement?

Shannon information. We will regard 
each presentation of real-valued inputs
$L = (L_1, L_2, \ldots, L_N)$ as a message,
where $L_i$ denotes the activity of the $i$th L
cell in the layer. We omit the $\pi$ superscripts
for clarity. Strictly speaking, even one real
number carries an infinite amount of
information. To avoid encountering
expressions of the form $\infty - \infty$, and
because infinite precision is physically and
biologically meaningless, we will think of 
the N-dimensional space of the L vectors 
as being divided into small boxes. Each 
box is labeled by its location L. Two mes¬ 
sages are regarded as identical if they lie in 
the same box. In the end, we will pass to 
the continuum limit, and the sums will 
become integrals. 

Given an ensemble of messages, let P(L) 
be the probability that a randomly chosen 
message lies in box L. Shannon 11 showed, 
in a classic paper, that the information 
conveyed by sending a message that lies in
box L is $I(L) = -\ln P(L)$. The average
information conveyed per message is
$\langle -\ln P(L) \rangle = -\sum_L P(L) \ln P(L)$,
where < . . . > is the usual ensemble 
average. If the base-2, rather than natural, 
logarithm were used here, the information 
would be measured in bits. 
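
The average information per message can be computed directly once the box probabilities are known. The short Python sketch below does this for two made-up ensembles over four boxes; the function name `average_information` and the probability values are illustrative assumptions, not data from the article.

```python
import math

def average_information(probs, base=math.e):
    """Return -sum_L P(L) log P(L); boxes with P(L) = 0 contribute nothing."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]     # four equally likely boxes
skewed = [0.7, 0.1, 0.1, 0.1]          # one box much more likely than the rest

h_uniform_nats = average_information(uniform)          # natural logarithm
h_uniform_bits = average_information(uniform, base=2)  # base-2 logarithm
h_skewed_nats = average_information(skewed)
```

The uniform ensemble carries more information per message than the skewed one, since its messages are less predictable.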

Now suppose each input presentation L 
generates a set of output values, denoted 
by the vector M, via some known compu¬ 
tation. Suppose that we are told the value 
of M—or, more strictly, which discrete 
box M lies in. (In general, M will not be 
uniquely determined by L because noise 
may be introduced in the computation of 
M.) How much additional information 
would we need to reconstruct the input 
message L that gave rise to M? (Shannon 
calls the ensemble average of this amount 
of additional information the equivo¬ 
cation.) 

The answer is $I_M(L) = -\ln P(L|M)$,
where P(L|M) is the conditional probability
of the input message lying in box L
given that the output lies in box M. Therefore
the amount of information that
knowing M conveys about L is the difference,
$I(L) - I_M(L) = \ln[P(L|M)/P(L)]$.
The ensemble average of this quantity is 
the rate R, per message, of transmission of 
information from the cell’s inputs to its 
output. This is the average amount of 
information that knowing M conveys 


about L. We have

$R = \langle \ln[P(L|M)/P(L)] \rangle$ (16)

Using the standard identity P(L|M)P(M)
= P(L,M) = P(M|L)P(L), where
P(L,M) is the joint probability that the
input lies in box L and the output lies in
box M, we obtain

$R = \langle \ln[P(M|L)/P(M)] \rangle$

$= -\langle \ln P(M) \rangle + \langle \ln P(M|L) \rangle$

$= \langle I(M) \rangle - \langle I_L(M) \rangle$ (17)

The right-hand side is the ensemble 
average of the total information conveyed 
by M, minus the information that M con¬ 
veys to one who already knows L. This sec¬ 
ond term is the “information” that M 
conveys about the processing noise, rather 
than about the signal L. 
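
To see that the two forms of the rate agree, one can evaluate both from a small joint probability table. In the Python sketch below, rows index input boxes L and columns index output boxes M. The function names and the two-box "noisy channel" table are assumptions chosen for illustration; `transmission_rate` evaluates the ensemble average of ln[P(L|M)/P(L)] and `rate_via_output_terms` evaluates the average information of M minus the average information of M given L.

```python
import math

def marginals(joint):
    """Return (P(L), P(M)) from a joint table with rows = L and columns = M."""
    p_l = [sum(row) for row in joint]
    p_m = [sum(col) for col in zip(*joint)]
    return p_l, p_m

def transmission_rate(joint):
    """R = sum_{L,M} P(L,M) ln[ P(L,M) / (P(L) P(M)) ], in nats."""
    p_l, p_m = marginals(joint)
    return sum(p * math.log(p / (p_l[i] * p_m[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

def rate_via_output_terms(joint):
    """R = <I(M)> - <I_L(M)>: information in M minus information about the noise."""
    p_l, p_m = marginals(joint)
    info_m = -sum(p * math.log(p) for p in p_m if p > 0)
    info_m_given_l = -sum(p * math.log(p / p_l[i])
                          for i, row in enumerate(joint)
                          for p in row if p > 0)
    return info_m - info_m_given_l

eps = 0.1                                   # hypothetical chance that noise flips the output
noisy = [[0.5 * (1 - eps), 0.5 * eps],
         [0.5 * eps, 0.5 * (1 - eps)]]
noiseless = [[0.5, 0.0], [0.0, 0.5]]

rate_noisy = transmission_rate(noisy)
rate_noiseless = transmission_rate(noiseless)
```

With no noise, the rate equals the full information of the two-box input, ln 2; adding noise lowers it.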

Maximum information preservation. 

Let us now state the proposed principle of 
maximum information preservation for 
each layer, or processing stage, of a per¬ 
ceptual network: Given a layer L of cells, 
and the stationary ensemble statistical 
properties of the signal activity values in 
the layer, and given that layer L is to pro¬ 
vide input to another cell layer M, the 
transformation of activity values from L 
to M is to be chosen such that the rate R of 
information transmission from L to M is 
maximized, subject to constraints and/or 
additional cost terms. These constraints or 
costs may reflect, for example, biochemi¬ 
cal and anatomical limitations on the for¬ 
mation of connections, or on the character 
of the allowed transformations. 

The formulation of this principle arose 
from studying Hebb-type rules and recog¬ 
nizing certain optimization properties to 
which they lead for single M cells. Once 
formulated, however, the principle is 
independent of any particular local algo¬ 
rithm, whether Hebb-related or otherwise, 
that may be found to implement it. Let us 
explore 

(1) the consequences of the principle 
for some simple cases; 

(2) how the principle might be imple¬ 
mented; and 

(3) how it may fit within a broader view 
of neural development. 

A single M cell. Under certain condi¬ 
tions, maximizing the output activity var¬ 
iance of the M cell maximizes the Shannon 
information rate R. We illustrate this for 
a particularly simple but instructive case. 
The argument can be made somewhat 


more general than this, but it is not true 
that maximum information rate and max¬ 
imum activity variance coincide when the 
probability distribution of signal values is 
arbitrary. 

Suppose the M cell receives inputs from 
a set of L cells $L_1, L_2, \ldots, L_N$, and that
the M cell’s output in the presence of
processing noise has the form

$M^\pi = (\sum_i L^\pi_i c_i) + \nu^\pi$ (18)

Here $\pi$ indexes the particular set of input
and output values, so that if L is repeated
but the output M is different, owing to
noise, this counts as a different set of
input-output values. The quantity $\nu^\pi$ is the
noise, a random variable differing from
one presentation to the next. Suppose that

(1) M has a Gaussian distribution, 
with variance denoted by V; 

(2) $\nu$ has a Gaussian distribution with a
mean of zero and variance denoted
by B; and

(3) $\nu$ is uncorrelated with any of the
input components; that is, $\langle \nu L_i \rangle$
= 0 for all $i$.

Then, omitting the details, we find that 
the information rate is 

$R = (1/2) \ln(V/B)$ (19)

For a given noise variance, B, this rate 
is maximized by maximizing the output 
variance V of the M cell. Note that V/B is 
essentially a signal-to-noise ratio. 
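
The omitted details can be sketched as follows, under assumptions (1) to (3) above and treating the noise as independent of the inputs. The box width $\delta$ is a symbol introduced here for illustration, and the expression for the average information of a finely discretized Gaussian variable is a standard result rather than something stated in the text.

```latex
% Average information of a Gaussian variable of variance \sigma^2,
% discretized into boxes of small width \delta:
\langle I \rangle \approx \tfrac{1}{2}\ln(2\pi e\,\sigma^2) - \ln\delta
% The output M is Gaussian with variance V:
\langle I(M) \rangle \approx \tfrac{1}{2}\ln(2\pi e\,V) - \ln\delta
% Given L, the only uncertainty left in M = \sum_i L_i c_i + \nu
% is the noise \nu, of variance B:
\langle I_L(M) \rangle \approx \tfrac{1}{2}\ln(2\pi e\,B) - \ln\delta
% The difference of the two averages is the rate; the box width cancels:
R = \langle I(M) \rangle - \langle I_L(M) \rangle = \tfrac{1}{2}\ln(V/B)
```

The cancellation of $\ln\delta$ is what allows the continuum limit to be taken without the rate diverging.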

Suppose that the noise model consists
instead of independent Gaussian noise, $\nu_i$,
being introduced on each input line $i$,
where each $\nu_i$ has variance $B$. Then
$M^\pi = \sum_i (L^\pi_i + \nu^\pi_i) c_i$, and the information
rate is found to be $R = (1/2) \ln[V/(B \sum_i c_i^2)]$.
In this case, $R$ is maximized for
fixed $B$ when $V/\sum_i c_i^2$ is maximized,
that is, when the connection strengths are
chosen so as to perform principal component
analysis on the cell’s inputs.
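
A small numerical check of this statement, for two input lines, is sketched below in Python. It assumes that V is the total output variance, that is, the signal variance $\sum_{ij} Q_{ij} c_i c_j$ plus the noise variance $B \sum_i c_i^2$; the function name `rate_input_noise` and the values of q and B are illustrative choices, not taken from the article. A coarse scan over weight directions stands in for an analytical maximization.

```python
import math

def rate_input_noise(c, Q, B):
    """R = (1/2) ln[V / (B * sum_i c_i^2)], with V the total output variance."""
    n = len(c)
    signal_var = sum(Q[i][j] * c[i] * c[j] for i in range(n) for j in range(n))
    norm_sq = sum(ci * ci for ci in c)
    V = signal_var + B * norm_sq           # signal variance plus noise variance
    return 0.5 * math.log(V / (B * norm_sq))

q, B = 0.6, 0.5                            # illustrative values only
Q = [[1.0, q], [q, 1.0]]                   # covariance matrix of two L cells

def rate_at(angle):
    """Rate for the unit-length weight vector c = (cos angle, sin angle)."""
    return rate_input_noise([math.cos(angle), math.sin(angle)], Q, B)

# Scan weight directions in one-degree steps over half a turn.
angles = [math.radians(k) for k in range(180)]
best_angle = max(angles, key=rate_at)
best_rate = rate_at(best_angle)
```

For this Q the principal component is the direction (1, 1)/sqrt(2), with eigenvalue 1 + q, and the scan selects it. The rate depends only on the direction of c, not on its length, which is why the ratio $V/\sum_i c_i^2$ is the quantity that matters.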

Redundancy and diversity. Suppose 
there is an arbitrary number of L cells but 
just two coupled linear M cells. Each M 
cell’s output is some linear combination of 
the L cells’ activities:

$M^\pi_1 = (\sum_i t_{1i} L^\pi_i) + \nu^\pi_1$ (20)

$M^\pi_2 = (\sum_i t_{2i} L^\pi_i) + \nu^\pi_2$ (21)

Each noise term is Gaussian and of var¬ 
iance B, the noise terms for the two M cells 
are uncorrelated with each other, and each 
noise term is uncorrelated with any of the
L cell activities. We treat the case in which
$M_1$ and $M_2$ have Gaussian distributions,
with $\langle M^\pi_1 \rangle = \langle M^\pi_2 \rangle = 0$. Our task is
to determine what values of the $t_{ni}$’s lead
to the maximum information being 
preserved during the processing of L-cell 
activities to give M-cell output activities. 

Note that the $t_{ni}$’s do not in general
stand for the strengths of particular con¬ 
nections. There may be both feedforward 
and lateral (M-to-M) connections whose 
joint effect, possibly over several time 
steps, is to produce the M-cell outputs of 
Equations 20 and 21. Our concern here is 
not with the particular connection 
strengths, nor with the development rule 
that may implement them, such as a Hebb- 
type rule, but rather with understanding 
what cell response properties (that is, what $t_{ni}$
values) are induced by the principle of
maximum information preservation. 

Omitting details of the proof, the result¬ 
ing information rate for this case is 

$R = (1/2) \ln(\mathrm{Det}\, Q^M) - \ln B$ (22)

where the elements of the 2 x 2 covariance
matrix $Q^M$ are $Q^M_{nm} = \langle M^\pi_n M^\pi_m \rangle$ and
“Det” denotes the determinant. We find

$\mathrm{Det}\, Q^M = B^2 + B(W_1 + W_2) + W_1 W_2 (1 - \rho_{12}^2)$ (23)

where $W_n$ is the output variance of cell $M_n$
in the absence of noise, and $\rho_{12}$ is the
correlation coefficient of the activities of
M cells 1 and 2, also in the absence of
noise.
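
The determinant can be checked in a few lines. The sketch below writes $W_n$ for the noise-free output variance of cell $M_n$ and $\rho_{12}$ for the noise-free correlation coefficient of the two M cells, and assumes, as stated above, that each noise term has variance B and is uncorrelated with the other noise term and with the L cell activities.

```latex
% Elements of the 2 x 2 output covariance matrix:
Q^M_{11} = W_1 + B, \qquad Q^M_{22} = W_2 + B, \qquad
Q^M_{12} = Q^M_{21} = \rho_{12}\sqrt{W_1 W_2}
% Determinant:
\mathrm{Det}\,Q^M = (W_1 + B)(W_2 + B) - \rho_{12}^2\, W_1 W_2
                  = B^2 + B\,(W_1 + W_2) + W_1 W_2\,(1 - \rho_{12}^2)
```

The noise adds only to the diagonal elements, which is why B multiplies the sum of the variances while the last term depends only on noise-free quantities.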

To maximize $R$, given $B$, we must maximize
Det $Q^M$. When $B$ is large, the third
term on the right-hand side of Equation
23, which is independent of $B$, is small
compared with the second term, which is
of order $B$. In that case, maximizing Det
$Q^M$ means maximizing $(W_1 + W_2)$. If no
constraint prevents us, we can achieve this
maximization by maximizing $W_1$ and $W_2$
separately. But this means constructing
each M cell so that its output variance,
which is $W_n$ in the absence of noise, or
$W_n + B$ in the presence of noise, is maximized.
This is exactly what we found to be
the optimum solution when there is only
one M cell. (See Equation 19.)

If the noise B is smaller, then the third 
term becomes relatively more important. 
The rate R is then maximized by making 
an optimal tradeoff between keeping $W_1$
and $W_2$ large, and making the responses
of the two M cells uncorrelated. 

We have thus found that, depending 
upon the noise level, there is competition 
between the value of having redundant M 


cell responses, which mitigate the 
information-destroying effects of noise, 
and the informational value of having 
different cells extract different linear com¬ 
binations of the input. A high noise level 
favors redundancy. In this case, both M 
cells compute the same linear combination 
of inputs, if there is only one such combi¬ 
nation that yields maximum output 
activity variance. A lower noise level 
favors diversity of response. In this case, 
the M cells compute different linear com¬ 
binations of the L cell activities, even 
though each M cell’s output variance may 
be reduced as a result of this choice. 

To make this more concrete, consider a 
simple example. There are two L cells, and 
the Q matrix for L cell activity has
$Q_{11} = Q_{22} = 1$ and $Q_{12} = Q_{21} = q$ with
$0 < q < 1$. We arbitrarily impose the constraint
that $t_{n1}^2 + t_{n2}^2 = 1$ for each M cell
($n = 1, 2$).

The solution that maximizes the preservation
of information then has $t_{11} = t_{22}$
and $t_{12} = t_{21}$, and the values of $t_{11}$ and $t_{12}$
are given in Figure 5 as a function of $B$ and
$q$. For large $B$, both M cells receive the
same linear combination of inputs:
$(L_1 + L_2)/\sqrt{2}$. For smaller $B$, the cells
measure different linear combinations of
$L_1$ and $L_2$. In the limit as $B$ approaches
zero, one M cell receives input only from
cell $L_1$ and the other only from $L_2$.
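
This two-cell example is small enough to check numerically. The Python sketch below takes the symmetric form of the solution ($t_{11} = t_{22}$, $t_{12} = t_{21}$, with $t_{11}^2 + t_{12}^2 = 1$) as given, evaluates the determinant of the output covariance matrix for unit-variance inputs with correlation q, and finds the best coefficients by a simple grid search over one angle. The grid search is a stand-in chosen for illustration and is not the author's derivation; the function names and parameter values are likewise assumptions. The closed form it is compared against, with $x = Bq/(1 - q^2)$, is the one obtained by maximizing the determinant analytically.

```python
import math

def det_qm(t11, t12, q, B):
    """Det Q^M for t11 = t22, t12 = t21 and input covariance [[1, q], [q, 1]]."""
    W = t11 ** 2 + t12 ** 2 + 2 * q * t11 * t12       # noise-free variance of each M cell
    cov = 2 * t11 * t12 + q * (t11 ** 2 + t12 ** 2)   # noise-free covariance of M1 and M2
    rho = cov / W                                     # noise-free correlation coefficient
    return B * B + 2 * B * W + W * W * (1 - rho * rho)

def best_coefficients(q, B, steps=20000):
    """Grid search over a in [0, pi/4], with t11 = cos a and t12 = sin a."""
    angles = (k * (math.pi / 4) / steps for k in range(steps + 1))
    a = max(angles, key=lambda ang: det_qm(math.cos(ang), math.sin(ang), q, B))
    return math.cos(a), math.sin(a)

def closed_form(q, B):
    """Analytical maximizer of Det Q^M, written in terms of x = Bq / (1 - q^2)."""
    x = B * q / (1 - q * q)
    if x >= 1:
        return 1 / math.sqrt(2), 1 / math.sqrt(2)     # redundant regime
    return (0.5 * (math.sqrt(1 + x) + math.sqrt(1 - x)),
            0.5 * (math.sqrt(1 + x) - math.sqrt(1 - x)))

low_noise = best_coefficients(q=0.5, B=0.5)    # x = 1/3: the two cells differ
high_noise = best_coefficients(q=0.5, B=3.0)   # x = 2: both cells compute the same sum
tiny_noise = best_coefficients(q=0.5, B=1e-9)  # x near 0: one L cell per M cell
```

The three cases reproduce the behavior described in the text: diverse responses at low noise, identical responses at high noise, and a one-to-one mapping as B approaches zero.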

A layer of M cells with nonlinearity and 
lateral connections. What does the princi¬ 
ple of maximum information preserva¬ 
tion, which we shall call the infomax 
principle, imply qualitatively in this more 
general case? Maximizing R means that we 
attempt to (1) maximize the total informa¬ 
tion conveyed by the output message M, 
and (2) minimize the information that M 
conveys to one who already knows the 
input message L. These criteria are related, 
but not equivalent, to the property of 
encoding signals so as to reduce redundan¬ 
cies present among the inputs to the per¬ 
ceptual system. The general idea that 
information theory can be useful for 
understanding perception is an old one. 
Significant contributions were made by 
Attneave in 1954, Barlow in the 1950s and 
1960s, and Marr in 1970. Much of this 
work has focused on the role of redun¬ 
dancy reduction. This property is one, but 
only one, aspect of the infomax principle. 
For example, we have seen that infomax 
also leads to the introduction of redun¬ 
dancy when this is useful in countering the 
effects of noise. 

I have analyzed the qualitative conse¬ 


quences of the infomax principle in some 
very simple models. 12 The results show 
that the principle can, under certain con¬ 
ditions, lead to L-to-M transformations 
with the following properties: 

• Topographic mapping from layer L 
to layer M, when the spatial extent of 
lateral connections within layer M is 
assumed to be limited. That is, near¬ 
neighbors in L tend to map to near¬ 
neighbors in M. 

• Map distortions, in which a greater 
number of M cells tend to represent 
the types of layer-L excitation pat¬ 
terns that occur more often. 

• The infomax principle selects which 
features of the input signals are rep¬ 
resented in layer M. Features having 
relatively high signal-to-noise ratios 
are favored. This is the extension of 
our previous redundancy-diversity 
result to the full-layer case. 

• Orientation-selective cells, and the 
arrangement of such cells in orienta¬ 
tion columns, can emerge for some 
very simple types of model input. 

• When time-delayed information is 
made available to the layer, the info¬ 
max principle can cause M cells to 
extract and encode temporal correla¬ 
tions, in a manner similar to the 
extraction of spatial correlations. 

I must emphasize that much work is 
required to determine the consequences of 
the infomax principle for cases involving 
more biologically realistic patterns of 
activity. 


Discussion 

From a simplified set of assumptions—a 
linear summation response, a simple 
Hebb-type rule having a covariance form, 
and feedforward connections only—we 
derived an optimization principle for the 
development of a single cell. This princi¬ 
ple states that the mature M cell is such that 
its output activity variance is maximized 
subject to constraints. More generally, we 
can have cost terms instead of, or in addi¬ 
tion to, constraints. Then the function 
maximized involves both the variance and 
the additional cost function. 

This led us to infer a proposed principle 
of maximum information preservation, 
subject to constraints. It is equivalent to 
variance maximization in some simple 
cases, but it has a much broader scope. For 
example, it can be applied to cases in which 
a layer of L cells provides input to an entire
layer of M cells, with lateral as well as
feedforward connections. It can likewise apply
to cases in which the response function is
not necessarily linear.

Figure 5. Values of the coefficients $t_{ni}$ that maximize the preservation of information
from layer L to M, for a simple case with two L and two M cells. Each M cell
output (see Equations 20 and 21) includes random noise having variance B, and q is
the correlation, or covariance, of the activities of the two L cells. For
$x \equiv Bq/(1 - q^2) > 1$, both M cells redundantly compute the same linear combination
of the L cell activities (all $t_{ni} = 1/\sqrt{2}$). For $x < 1$, the optimal t values satisfy
$t_{11} = t_{22}$ and $t_{12} = t_{21}$, where the upper curve gives $t_{11}$ and the lower curve gives $t_{12}$, or the
reverse. The curves for $x < 1$ are given by $y = (1/2)[(1 + x)^{1/2} \pm (1 - x)^{1/2}]$; this is
derived by maximizing Det $Q^M$. (See Equation 23.)

The consequences of this proposed prin¬ 
ciple are only beginning to be explored. 
One set of issues that needs clarification is 
the choice of biologically appropriate con¬ 
straints and cost terms. A second, related, 
issue involves the choice of algorithms, 
whether of Hebb type or otherwise, that 
control the development of feedforward 
and lateral connections so as to implement 
the optimization principle. While much 
work needs to be done, I suggest that this 
principle, or something like it, may play an 
important role in determining the charac¬ 
ter of perceptual processing at least in its 
early stages, where there is a chance that 
feedback influences may not affect the 
development of feature-analyzing func¬ 
tion in an essential way. Possibly the prin¬ 
ciple may play some role even in the 
presence of significant feedback, but it is 
not clear at this time how best to analyze 
this case. 

What might we expect to be the charac¬ 
ter of a layer of cells developing according 
to the principle of maximum information
preservation, for cases of biological 
interest? Although the necessary calcula¬ 
tions for sufficiently realistic cases have 
not yet been carried out, we can speculate 
on the outcome. 

Suppose there is a constraint on the dis¬ 
tance within layer M over which the 
activity of one M cell can affect another. 
There might be, for example, a constraint 
on the length of lateral connections. Sup¬ 
pose also that each region of layer M 
“sees”, or receives input from, only a 
limited region of layer L, and that nearby 
regions of M “see” nearby regions of L. 
Then, if the noise variance B is large, and 
there are not many M cells that “see” the 
same set of L cells, we might find that each 
M cell develops so as to maximize its 
activity variance, and performs processing 
that is redundant with that of many of its 
neighbors. 

On the other hand, if B is smaller, or 
there are a large number of M cells that 
“see” the same L region, we may expect 
that the M cells in a region do not all per¬ 
form the same processing function on the 
inputs from the L layer. Instead, they 
might span a range of feature-analyzing 
properties, each of which has a moderately 
high variance. 

In the visual system of cats and mon¬ 
keys, there are multiple layers of center- 
surround cells, followed by layers of 


orientation-selective cells. The orientation- 
selective cells begin at a different layer in 
cats than in monkeys. It is possible that, in 
response to the ensemble of inputs seen by 
a particular layer, the layer can develop 
either center-surround or orientation- 
selective cells, as occurred in our previous 
model simulations. 1 Perhaps a parameter 
such as the noise level B “tunes” for 
redundancy or diversity of response. 
Redundancy could favor center-surround 
cell formation, with many cells perform¬ 
ing substantially the same processing func¬ 
tion. Diversity could favor the formation 
of orientation-selective cells spanning the 
entire range of orientation preferences 
within each region of the layer. (Of a group 
of cells comprising all orientation prefer¬ 
ences, only a small fraction will fire when 
presented with an oriented edge of illumination.)
Hubel and Wiesel discovered 3,4
that orientation-selective cells are
arranged, within a cortical layer, so that
each small region of cortex (≈1 x 1
millimeter) contains the “machinery” for
analyzing substantially all edge orienta¬ 
tions seen by either eye within a small 
region of visual space. Perhaps the prin¬ 
ciple of maximum information preserva¬ 
tion, combined with limits on lateral 
interaction distance, can account for this 
efficient organization. 

Local algorithms. The infomax princi¬ 
ple is stated in terms of maximizing a com¬ 
plicated expression (see Equation 16). Is 
there an algorithm or process that deals 
with much simpler quantities and 
computations—local to each cell or pair of 
connected cells in a network—and yet 
implements the infomax principle, at least 
approximately? 

I have found 12 that, for some simple 
cases, a Hebb-related algorithm developed 
by Kohonen 13 implements some of the 
qualitative features required by the infomax
principle. This algorithm was developed
to show how lateral connections can
induce topographic order in a simple
model, and makes no reference to noise or 
information content. These results suggest 
that it may be possible to devise a local 
algorithm that more fully embodies the 
requirements of the infomax principle. 

The relationship between the principle 
and such an algorithm would be com¬ 
plementary. The principle would suggest 
what the function of the algorithm, and 
the lateral connections it describes, might 
be—that is, what role the processes and 
connections might serve in the construc¬ 
tion of a perceptual system. The algorithm 
would show how a complex optimization 
principle could be implemented by a net¬ 
work of cells that individually have little 
computational power. 

Although I have focused on algorithms 
that perform activity-dependent modifica¬ 
tion of connections, other types of 
mechanisms may be used to implement a 
given optimization principle. Biochemical 
cell-cell adhesion markers, chemical or 
other gradients that may help to establish 
topographic maps, particular cell types 
that implement complex types of connec¬ 
tivity (as in the retina), and other mechan¬ 
isms may all play a role. An organizing 
principle by itself does not determine the 
many design details that a particular 
system—biological or synthetic—may use 
to implement it. 

Infomax and perceptual data. Why 

might it be important for a perceptual sys¬ 
tem to maximize the amount of informa¬ 
tion preserved from one layer to the next? 

Presumably, one goal of a perceptual 
system is to provide the brain with the 
means of discriminating different environ¬ 
mental situations that may demand differ¬ 
ent responses by the animal. 

For a very simple network with only a 
couple of layers of processing from 
environmental input to motor output, we 
could imagine using some sort of super¬ 
vised learning mechanism. The mecha¬ 
nism would pair inputs with the desired 
output responses and adjust the connec¬ 
tion strengths accordingly. Such a process 
involving more than a few layers, however, 
appears biologically implausible, and its 
performance may scale poorly as the num¬ 
ber of layers is increased. 

In a complex network, or in an animal’s 
brain, it is totally unclear how a compo¬ 
nent layer is to “decide” what transforma¬ 
tion its connections should perform —if we 
assume that the layer needs to “know” 
what environmental features are impor¬ 
tant for the animal to respond to. This is 


the classic artificial intelligence credit 
assignment problem: if the final output 
from a complex system is correct, which 
connections should be rewarded or 
strengthened? 

The approach we propose avoids this 
problem. Instead of requiring that a con¬ 
nection or layer “know about” the ulti¬ 
mate goals of the animal, we use only local 
information. The information that reaches 
a layer is processed so that the maximum 
amount of information is preserved. We 
have seen that this does not in general lead 
to a trivial one-to-one identity mapping, in 
which each M cell receives input from only 
one L cell. In general, the identity mapping 
is not a solution that maximally preserves 
information, owing to the role of noise in 
our model. Instead, each M cell tends to 
respond to features that are statistically 
and information-theoretically most signif¬ 
icant, in a sense similar to that of principal 
component analysis. Applying the principle
of maximum information preservation
to each layer of processing in turn results
in the emergence of a sequence of
feature-analyzing functions.

The following analogy may help you to 
see intuitively how the process works. 
Imagine a person in an organization, 
whose job is to make the most informative 
possible summary of the data that he 
receives each week. The type of data he 
receives depends upon the environment 
external to the organization, the structure 
of the organization (what “layer” he is 
part of), and various constraints. Over 
time, he finds that a particular represen¬ 
tation of information—for example, 
graphical plots involving various 
variables—serves him best in preparing his 
summary. If he is allowed to interact with 
others in his “layer”, the criterion can be 
broadened (as we did for the cells) to state 
that the composite output of his layer 
should be as informative as possible. 

Note that some set of processing func¬ 
tions will end up being provided by this 
person’s “layer”, without the workers 
needing to know either what the goals of 
the entire organization are, or what infor¬ 
mation is deemed most important by their 
superiors in later “layers”. 

In both the organizational analogy and 
the real network, there is no need for any 
higher layer to attempt to reconstruct the 
raw data from the summary. The point is 
rather to enable the higher layers to use 
environmental information to dis¬ 
criminate the relative value of different 
actions. If the needed information has 
been lost at intermediate stages, it cannot 


be used. If a local optimization principle 
is to be used—one that does not attempt to 
take account of remote high-level goals— 
then we do not know what particular 
information is going to be needed at high 
levels. Since we don’t know what informa¬ 
tion we can afford to discard, it is reasona¬ 
ble to preserve as much information as 
possible within the imposed constraints. 
The principle of maximum information 
preservation thus appears to be an 
extremely natural and attractive one to use 
in the construction of a layered perceptual 
system. 

Evolution and infomax. The infomax 
principle may determine what transforma¬ 
tion each layer of a given network will 
implement. However, it does not specify 
the “gross architecture” of the network; 
that is, which layers provide input to which 
other layers. Nor does it specify the vari¬ 
ous parameters that may affect layer devel¬ 
opment, such as noise level, the allowed 
range of lateral connections, and so on. 
These aspects of the design may be deter¬ 
mined by biological evolution, or by other 
principles not yet identified. 

For an analogy, think of an electronic 
circuit designer who is not free to modify 
the properties of the components he or she 
uses, but who can connect them to form a 
variety of circuits. In the case of our pro¬ 
posed principle, each “component” is an 
entire cell layer, and the infomax principle 
determines that layer’s behavior given a 
particular gross architecture or “circuit 
design”. Thus evolution can “close the
loop” on the design process, favoring the
survival of organisms whose perceptual 
systems are well-adapted to their envi¬ 
ronment. 

There is a separate and important evolu¬ 
tionary function that a generic principle 
for the development of a perceptual net¬ 
work layer—whether it be infomax or 
some other principle—can serve. Suppose 
that an evolutionary mutation produces a 
modified eye, or merges auditory signals 
into the visual pathway at some new point. 
If there were no generic principle for layer 
development, we might imagine that 
mutations would have to occur simultane¬ 
ously in the processing function of several 
layers, for those layers to be able to use the 
novel input properly. But if there is such 
a generic principle—one that applies to 
each layer regardless of what type of input 
reaches it—then the novel input will auto¬ 
matically be processed in accordance with 
that principle. This suggests that the existence
of a generic principle may greatly
increase the likelihood of a mutation being
adaptive.

A broader context. Other complex sys¬ 
tems, besides neural networks, pose 
challenges similar to those we have dis¬ 
cussed. How might complex structures 
and behaviors that may appear goal- 
oriented emerge from relatively simple 
local rules? We have seen that a local 
dynamical rule of Hebb type, acting at 
synapses, leads to an optimization 
principle—variance maximization—at the 
level of the whole cell. This suggested an 
optimization principle—maximum infor¬ 
mation preservation—that may apply at 
the level of an entire layer. From the stand¬ 
point of information theory, we may find 
that the immune response system and bio¬ 
logical evolution, among other complex 
systems, have certain abstract similarities 
to the process of neural development and 
plasticity, although the dynamical rules 
and the substrates upon which they act are 
quite different. 

A great deal of work remains to be 
done, if we are to take this or 
some other proposed organizing 
principle, extract testable predictions from 


it, and determine its scope and limitations. 
We need to identify and test such princi¬ 
ples, in order to complement and help to 
focus the enormous amount of detail being 
revealed by progress in experimental neu¬ 
roscience. The study of such principles 
may also provide the understanding 
needed to develop synthetic perceptual systems
that require no explicit programming. □

ADVANCE ANNOUNCEMENT


The theme for this year's symposium is "Integrated Networking and Open Architecture." The symposium 
will highlight topics related to the design, modeling, protocols, performance, and implementation of 
current and soon to be available network systems. 

Three tutorial sessions are offered on April 11th to provide in-depth instruction in the following areas: 

■ Network Security 

■ MAP/TOP/GOSIP—Open Systems for the Real World 

■ The Upper Three Layers—Putting It All Together 

Papers dealing with the following topics will be presented on April 12th: 

■ Analysis and Optimization in Distributed Networks ■ Network Security 

■ OSI Protocols ■ Congestion Control and Load Balancing

■ Issues in Multi-Access Systems ■ Protocol Specification and Verification

Papers dealing with the following topics will be presented on April 13th: 

■ Resource Allocation in Integrated Networks ■ Distributed System 

■ Protocol Implementations ■ Network Management 

■ Routing ■ Integrated Networking Issues 

The symposium will also include a panel discussion entitled "National Data Network." 

Computer Networking Symposium

April 11-13, 1988 ■ Sheraton National Hotel ■ Washington, DC Area 


REGISTER TODAY!! For additional information, please contact: Conference Department, 

The Computer Society of IEEE, 1730 Massachusetts Avenue, N.W., Washington, DC 20036, (202) 371-1013. 
Project launched for application layer service definition


Accredited Standards Committee X3 has approved work on a new project to
develop an American national standard for application layer service
definition supported by presentation connectionless mode transmission.
The actual development work will be carried out by Technical Committee
X3T5, Open Systems Interconnection (OSI).

The goals of X3T5 are to contribute to the international effort on the
definition of application layer service supported by presentation
connectionless mode transmission, and to adopt an American national
standard upon acceptance as an international standard.

This work will be limited to the definition of the application service
provided by the application layer of the OSI reference model supported
by presentation connectionless mode transmission. It is also applicable
to systems requiring a connectionless mode of operation to satisfy
certain specific application requirements.

Since X3T5 intends to complete this draft American national standard
upon its acceptance as an international standard, interested
participants and users are encouraged to become involved in the process
as early as possible to influence the international work that will be
undertaken later.

To join X3T5 to work on this project or any other project under way,
contact the X3T5 chair, Jerrold S. Foley, Electronic Data Systems Corp.,
300 E. Big Beaver Rd., PO Box 7019, CUBE 5176, Troy, MI 48083;
(313) 524-8416.


NBS completes first part of Defense Communications Agency project

Researchers at the National Bureau of Standards have completed the first
part of a project to build gateways between US Defense Dept.
communication protocols and Open Systems Interconnection (OSI)
networking protocols. The project was sponsored by the US Defense
Communications Agency (DCA).

OSI standards enable the equipment and systems of different
manufacturers to communicate with each other through networks. The
gateways will be used to maintain the agency’s operations during
conversion to OSI networks.

Working with a guest scientist from IBM, the researchers in the NBS
Institute for Computer Sciences and Technology have developed gateways
and a test system for linking the Defense Dept. electronic mail protocol
(Simple Mail Transfer Protocol) to its OSI counterpart (Message Handling
Facility/X.400). The automated test system is used to determine whether
products comply with the gateway specifications.

Specifications are still being developed for a second gateway to connect
the military’s File Transfer Protocol and OSI’s File Transfer, Access,
and Management Protocol. NBS expects to complete work on the second
gateway, including a test system, by the end of this year.

When completed, these specifications and testing procedures will be
available to computer and communications vendors as well as to test
service organizations for use in developing gateway products that link
Defense Dept. and OSI protocols.


Standards for DES encryption on 802.3 LANs to be discussed at DC meeting

A meeting will be conducted March 24 in Washington, DC, to discuss the
standardization of DES encryption on 802.3 LANs. Sponsored by the IEEE
Technical Committee on Security and Privacy, the meeting will be held a
day after completion of Compstan 88, the three-day 1988 Conference on
Computer Standards.

All those in the industry interested in discussing and participating in
the development of this standard are encouraged to attend.

Several vendors are developing products to provide secure communications
on 802.3 LANs. The products are similar, although there are enough
differences that a secure multi-vendor 802.3 LAN is not possible. As
vendors know, customers buy from many sources and expect to effect
communications among the various products. All customers need security,
from the most stringent government requirements for secrecy to the less
stringent, but equally important, commercial requirements for privacy.
Thus, using standard security mechanisms would be advantageous in
marketing any LAN product.

The primary discussion at the meeting will focus on determining if a
standard is necessary and defining the scope of the standard activities.
Two areas will be prepared for discussion: encryption and key
management.

Topics to be addressed in the area of encryption will involve the need
for a common encryption algorithm (DES is a prime candidate); the
standardization of the mode of operation (that is, block, cyphertext
feedback); the establishment of a common key length for the encryption
algorithm and the public key; and the definition of which fields should
be encrypted.

Topics to be addressed in relation to key management relate to the need
for a common key management scheme, and provisions for broadcast and
multicast packets.

Discussion about the key management scheme will span the use of schemes
that are centralized, distributed, or a combination of both. A
distributed system would most likely employ a public key system for
creating session keys without involving a third party in establishing
the connection. A centralized system would require an online key
management system (third party) to monitor and authorize all connection
attempts across the network. A hybrid system containing both distributed
and centralized key management concepts will also be considered.

For more information on this activity, contact Kim Kirkpatrick at
(617) 271-7555.
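
The "mode of operation" question raised in the DES item on 802.3 LANs can be made concrete with a small sketch of full-block cipher feedback, in which each ciphertext block is fed back to generate the keystream for the next block. The code below is illustrative only: Python's standard library has no DES, so `toy_block_encrypt` is a hypothetical stand-in built from HMAC-SHA-256 truncated to the 8-byte DES block size, and none of the names come from any proposed standard.

```python
import hashlib
import hmac

BLOCK = 8  # DES operates on 8-byte (64-bit) blocks


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for DES block encryption (assumption: a keyed hash, NOT DES)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Encrypt in full-block cipher feedback mode."""
    out = bytearray()
    feedback = iv
    for i in range(0, len(plaintext), BLOCK):
        chunk = plaintext[i:i + BLOCK]
        keystream = toy_block_encrypt(key, feedback)
        cipher_chunk = bytes(p ^ k for p, k in zip(chunk, keystream))
        out += cipher_chunk
        feedback = cipher_chunk  # the ciphertext block drives the next step
    return bytes(out)


def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Decrypt; note that only the block *encryption* function is needed."""
    out = bytearray()
    feedback = iv
    for i in range(0, len(ciphertext), BLOCK):
        chunk = ciphertext[i:i + BLOCK]
        keystream = toy_block_encrypt(key, feedback)
        out += bytes(c ^ k for c, k in zip(chunk, keystream))
        feedback = chunk  # feed back the received ciphertext block
    return bytes(out)


if __name__ == "__main__":
    key = b"8bytekey"
    iv = bytes(range(BLOCK))
    frame = b"payload of an 802.3 frame"
    sealed = cfb_encrypt(key, iv, frame)
    print(sealed.hex())
    print(cfb_decrypt(key, iv, sealed))
```

Two stations interoperate under this sketch only if they agree on the block function, the key, the initialization vector convention, and the feedback width, which is why the mode of operation and key length appear on the meeting agenda alongside the choice of algorithm.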
IEEE Board approves Computer Society standards activities


The IEEE Standards Board approved the following new Computer Society
projects at its December 1987 meeting:

• P1003.0, Guide to POSIX Based Open System Architecture

• P1003.5, Ada Language Binding for POSIX

• P1003.6, Security Interface Standards for POSIX

• P1016.2, Guide to Software Design Descriptions

• P1163, A Standard Interface for IEEE Standard 1076-1987 to Computer
Aided Design and Manufacturing (CAD/CAM) tools

• P1164, Recommended Practice for Coding in IEEE Std. 1076-1987

• P1165, Recommended Practice for the Interrelationships Between
IEEE 1076 VHDL and EIA RS44 EDIF

• P1173, Standard Set of Functional Components of a Simulation Language

The board also approved six new standards:

• 802.3D, LAN: Fiber Optic Inter-Repeater Link

• 961, Eight Bit Micro Bus System

• 1000, Eight Bit Backplane Interface

• 1058.1, Software Project Management Plan

• 1063, Software User Documentation

• 1076, Description Language for Electronic Hardware

For more information on current IEEE projects, or assistance in
purchasing copies of draft or final standards, contact the IEEE Computer
Standards Secretariat, 1730 Massachusetts Ave. NW, Washington,
DC 20036-1903; (202) 371-0101.


MIT, leading workstation vendors form consortium to support windowing
standard

The Massachusetts Institute of Technology and several leading computer
companies have formed the MIT X Consortium. The objective is to support
and further develop the X Window System, an industry standard graphics
windowing system for workstations in multitasking, networked computing
environments.

The system has been adopted by software and hardware developers as an
industry standard method for simultaneously displaying multiple
applications on a computer screen, according to MIT. With it,
workstation users in a multivendor network can simultaneously access a
wide array of computing resources, such as supercomputers or specialized
processors, through windows on the workstation screen.

The consortium, open to organizations dedicated to promoting the
evolution of the X standard, is responsible for maintaining the
documentation of the major components of the X system and a software
validation suite for server implementations and client interfaces. It
will also fund systems enhancements, such as 3D graphic extensions and
libraries, video extensions, and protocol binding, and toolkits for
languages other than C.

The founding consortium members include Apollo Computer, Apple Computer,
AT&T, CalComp, Digital Equipment Corp., Hewlett-Packard, Sequent
Computer Systems, Sony, Sun Microsystems, Tektronix, and Xerox.


Text processing, office, and publishing systems group seeks participants

The X3V1 technical committee on text processing, office, and publishing
systems standards is working on protocols for distributed office
applications and other text interchange activities. Persons with
expertise or general interest in this area are invited to attend the
committee’s meetings.

Distributed office applications are characterized by the availability of
a diverse array of electronic office equipment and enabling software,
dispersed geographically but capable of acting cooperatively by means of
communications media and techniques. Facilitating the homogeneous
interconnection between these heterogeneous offerings is extremely
important to realize their full benefits.

X3V1 is currently concentrating on developing a general model for
distributed office application protocols, standards for print service
access, and document filing and retrieval.

For further information, contact the chair of the distributed office
applications task group, Robert Christie, Control Data Corp., 4201 N.
Lexington Ave., St. Paul, MN 55126-6198; (612) 482-6689.

For more information on the X3V1 committee, contact its chair, L.M.
Collins, IBM Corp., 200 Las Colinas Blvd., Irving, TX 75039;
(214) 556-4390.


Standards Briefs

Posix goes international

The joint ISO/IEC international standards committee, known as JTC1, has
approved the establishment of an international working group on POSIX.

Headed by the Computer Society’s Jim Isaak, chair of the P1003 POSIX
Subcommittee, this new working group will coordinate international
review and comment on the IEEE Trial Use Standard for Portable Operating
Systems Environments (POSIX).

For information on POSIX standardization activities, contact the IEEE
Computer Standards Secretariat, 1730 Massachusetts Ave. NW, Washington,
DC 20036-1903; (202) 371-0101.


IEEE Standards Office slates computer software seminars in
San Diego, London

The IEEE Standards Office will conduct a series of computer software
seminars in April, May, and June in San Diego, Calif., and London,
England.

“Software Verification & Validation” and “Software Quality Assurance”
will be offered April 18-20 in San Diego and May 30-June 1 in London.

“Software Testing” and “Software Configuration Management” will be
offered April 21-22 in San Diego and June 2-3 in London.

In addition, “Software Requirement Specification” will be offered in San
Diego April 21-22.

For additional information on the seminars, contact Seminar Manager,
Standards Office, IEEE Headquarters, 345 East 47th St., New York, NY
10017-2369; (212) 705-7759.
Call for Papers, Participants, and Exhibitors
The 2nd Symposium on the
Frontiers of Massively Parallel Computation

FRONTIERS '88


October 10-12, 1988
George Mason University 
Fairfax, Virginia 

Programming Languages 
Architectures 
Algorithm Development 
Graph Theory 
Image Processing 
Numerical Modeling 
Data Base Management 
Interconnection Networks 
Hierarchical Structures 
Neural Networks 
New Technologies 



PROGRAM COMMITTEE 

Prof. David Schaefer, Chairman 
George Mason University 

Dr. Ray Arnold 
NASA Headquarters 

Dr. Ken Batcher 
LORAL Defense Systems-Akron 

Dr. Jack Dongarra 
Argonne National Laboratory 

Prof. Michael Duff 
University College, London 

Dr. Milton Halem 
NASA/Goddard Space Flight Center 

Dr. James Hardy 
Whitney/Demos Productions 

Prof. Dennis Parkinson 
Active Memory Technology 


Prof. Tomaso Poggio
Massachusetts Institute of Technology

Prof. John Reif
Duke University


IMPORTANT DATES

Abstracts due:          April 15, 1988
Acceptances sent:       June 15, 1988
Registrations due:      September 15, 1988
Tutorials:              October 10, 1988
Sessions and Exhibits:  October 11-12, 1988
Manuscripts due:        October 12, 1988


CONTACT INFORMATION 

To receive an advance program 
with registration information, send 
name, address, and phone 
number to: 

Frontiers '88 Symposium 
P.O. Box 334 

Greenbelt, Maryland 20770-0334 


Submit abstracts to: 

Prof. David H. Schaefer 
Department of Electrical and 
Computer Engineering 
George Mason University 
4400 University Drive 
Fairfax, Virginia 22030 


To present an exhibit: 

Mr. James R. Fischer 

Chairman, Frontiers '88 

Image Analysis Facility, Code 635

NASA/Goddard Space Flight Center 

Greenbelt, Maryland 20771 

(301)286-3464 


Prof. Anthony Reeves 
University of Illinois

Prof. Azriel Rosenfeld 
University of Maryland

Dr. Paul Schneck 
Supercomputing Research Center 

Dr. Steven Squires 
DARPA 

Dr. Guy Steele 
Thinking Machines Corporation 

Prof. Leonard Uhr 
University of Wisconsin 


BACKGROUND AND SUBMISSION INFORMATION 

The Frontiers '88 Symposium will focus on the increasing importance of massively parallel computer systems and data parallel
programming techniques. Applications, theory, and architectures for computers with more than 1,000 processors will be discussed by
users, scientists, and developers from industry, academia, and Government. Selected papers will be published in a dedicated
issue of the Journal of Parallel and Distributed Computing. Tutorials will survey the concepts embodied in massively parallel systems
and broad applications areas. Exhibits will feature commercial and prototype massively parallel systems, relevant journals, and
newsletters.

Those interested in presenting a paper must submit a three-page abstract to the program chairman no later than April 15, 1988.
Complete names, addresses and telephone numbers for all authors (beginning with primary author) must accompany the
abstract. Letters of acceptance and guidelines for preparing a final paper will be sent to authors by June 15, 1988. Authors who prefer
to present their paper as part of a poster session should so indicate. The program chairman reserves the right to assign authors to a
poster session in the event of greater than expected response.


Sponsored by George Mason University, Computer Society of the IEEE (pending), IEEE-National Capital Area Council, and NASA
Supported by grants from Active Memory Technology, LORAL Defense Systems-Akron, Martin Marietta Aerospace, Science Applications Research, and Thinking Machines Corporation 
Society’s handbook distributed to members


A handbook describing the varied 
professional resources and benefits of 
Computer Society membership has been 
mailed to the 90,000 members of the 
society. The pamphlet was designed to 
serve as a reference to the services and 
programs available to current members 
as well as to potential members. 

The handbook describes the society’s 
organization and offices; periodicals 
(like Computer) and Computer Society 
Press offerings; the conference, awards 
and ombudsman programs; chapter, 
educational, standards, and student 
activities; the purpose and scope of each 
of the society’s 33 active technical committees
plus the Committee on Public
Policy; electronic mail services; and the 
various ways participation in the society 
can help members enjoy and enhance 
their careers. 

The booklet outlines the history of
the society and the Institute of Electrical
and Electronics Engineers, tracing
their origins to 1884 when the American
Institute of Electrical Engineers
was formed. In 1963, the
AIEE merged with the Institute of
Radio Engineers (which itself had been
in existence for half a century) and the
new organization adopted the IEEE as
its name. With some 300,000 current
members, the IEEE has become the
world’s largest technical professional
organization in terms of scope and size.

The Computer Society, the largest of
the IEEE’s 33 technical units, evolved
with formation of the IEEE from computer
groups within the two parent
organizations. From these beginnings,
the CS has grown to include a large and
diverse membership united by a common
interest: advancing the theory and
practice of computer science and
engineering. The organization brings
together elected officials, volunteers,
and society staff to serve the membership
in promoting cooperation and the
exchange of technical information.

If you did not receive your copy of 
the handbook or want to obtain additional
copies for yourself or colleagues,
circle 188 on the Reader Service Card. 


The Computer Society Organization 
Board of Governors 


President 



1988 TC chairs 

Computational Medicine 

John Long 

University of Minnesota 
Suite 408 

2829 University Ave. SE 
Minneapolis, MN 55414 
Off.: (612) 627-4850 
Res.: (612)942-0140 
Compmail+ j.long

Computer Architecture 

Roger Anderson 

Lawrence Livermore Laboratory 

L-306 

PO Box 808 
Livermore, CA 94550 
Off.: (415) 422-4239 

Computer Communications 

William Livingston 
Vance Systems 
3901U Bonanza Blvd. 

Chantilly, VA 22021 
Off.: (703) 471-9402 
Res.: (703)624-3326 
Compmail+ w.livingston

Computer Elements 
Ronald Bell 
Sperry 

322 N. Sperry Way 
Salt Lake City, UT 84116 
Off.: (801) 594-5386 

Computer Graphics 

Lawrence Rosenblum 
Naval Research Laboratory 
Code 5810 

4555 Overlook Ave. SW 
Washington, DC 20375 
Off.: (202) 767-3743 
Res.: (301) 424-5762 
Compmail+ l.rosenblum

Computer Languages 

Pei Hsia 

University of Texas/Arlington 
Computer Science 
2100 Oak Bluff Dr. 

Arlington, TX 76001 
Off.: (817) 273-3785 
Res.: (817) 275-8764 
Compmail+ p.hsia

Computer Packaging 

Martin Freedman 

IBM T. J. Watson Research Center 

13-209 

PO Box 218 

Yorktown Heights, NY 10598 
Off.: (914) 945-1711 
Res.: (203) 322-9570 

Computers in Education 

Doris Carey 
University of Nevada 
4505 Maryland Parkway 
Las Vegas, NV 89154 
Off.: (702) 739-3860 
Res.: (702) 641-7090 
Compmail+ d.carey
Computing and the Handicapped

Elmer Hoyer 
Wichita State University 
Box 44 

Dept. of Electrical Engineering
Wichita, KS 67208 
Off.: (316) 689-3415 
Res.: (316) 744-1929 

Database Engineering 

Sushil Jajodia 

National Science Foundation 
Rm. 310 

Washington, DC 20550 
Off.: (202) 357-9572 
Res.: (703)243-0471 
Compmail+ s.jajodia

Design Automation 

Sumit DasGupta 
IBM, East Fishkill Facility 
Zip 3A1 
Route 52 

Hopewell Junction, NY 12533 
Off.: (914) 894-0540 
Res.: (914) 297-1946 
Compmail+ s.dasgupta 

Display Ergonomics 
James Greeson, Jr. 

IBM 
H-29-061 
PO Box 12195 

Research Triangle Pk., NC 27709 
Off.: (919) 543-6655 
Res.: (919) 787-3910 
Compmail+ j.greeson

Distributed Processing 
Walter Kohler 
University of Massachusetts 
Dept. of ECE, Marcus Hall
Amherst, MA 01003 
Off.: (413) 545-0765 
Compmail+ w.kohler

Fault-Tolerant Computing 

David Rennels 
Computer Science Dept. 

4731 Boelter Hall 
University of California 
Los Angeles, CA 90024 
Off.: (213) 825-4033 
cs.ucla.arpa 

Mass Storage Systems 
Patric Savage 
Shell Development 
PO Box 481 
Houston, TX 77001 
Off.: (713) 663-2384 
Res.: (713) 934-8063 
Compmail+ p.savage

Mathematical Foundations of Computing 

Ashok Chandra 

IBM T.J. Watson Research Center 
Route 134 
PO Box 218 

Yorktown Heights, NY 10598 
Off.: (914) 945-1752 
Res.: (914) 739-2175 
Compmail+ a.chandra 


Microprocessors/Microcomputers 
Martin Freeman 
Stanford University 
Center for Integrated Systems 
Stanford, CA 94305 
Off.: (415)991-3591 
Res.: (415)493-5382 
Compmail+ m.freeman

Microprogramming 
Joseph Linn 

Institute for Defense Analysis 
Computer/Software Engineering 
1801 N. Beauregard St. 

Alexandria, VA 22311 
Off.: (703) 824-5500 
Res.: (703) 671-8527 
Compmail+ j.linn 

Multiple-Valued Logic 

Michel Israel 

IEE-CNAM 

18, Allee Jean-Rostand 

Evry Cedex, France 

Phone: (1) 6-077-9740 

Oceanic Engineering 
Michael Guberek 
Global Imaging 
201 Lomas Santa Fe Dr. 

Suite 360 

Solana Beach, CA 92075 
Compmail+ m.guberek 

Office Automation 

David Choy 

IBM Almaden Research Laboratory 
MS K52/803 
650 Harry Rd. 

San Jose, CA 95120 
Off.: (408) 927-1846 
Res.: (415)969-1811 
Compmail+ d.choy 

Operating Systems 
Joseph Boykin 
Encore Computer 
257 Cedar Hill St. 

Marlboro, MA 01752-3004 
Off.: (617) 460-0500 
Res.: (617) 845-1074 
Compmail+ j.boykin

Optical Processing 

Ravi Athale 
BDM 

7915 Jones Branch Dr. 

McLean, VA 22102
Off.: (703) 848-7556 

Pattern Analysis and Machine Intelligence 

J.K. Aggarwal 
University of Texas 
Electrical Engineering Dept. 

Austin, TX 78712 
Off.: (512)471-3259 
Res.: (512)451-4697 
Compmail+ j.aggarwal 


Personal Computing 
Walter Beam 

Systems Engineering Dept. 


George Mason University 
Fairfax, VA 22030 
Off.: (703) 323-2782 
Res.: (703) 370-3431 
Compmail+ w.beam 

Real-Time Systems 
Andre van Tilborg 
2600 Riggs Way Parkway 
Minneapolis, MN 55413 

Robotics 
Mohan Trivedi 
University of Tennessee 
Ferris Hall 

Dept. of Electrical and Computer
Engineering 
Knoxville, TN 37996 
Off.: (615)974-5450 
Res.: (615) 523-5853 
Compmail+ m.trivedi

Security and Privacy 
Carl Landwehr 
Naval Research Laboratory 
Code 5593 

Washington, DC 20375 
Off.: (202)767-3381 

Simulation 

Heimo Adelsberger 
Vienna Business School 
Augusse, 2 

A-1090 Vienna, Austria 
Res. phone: (43)222-476-227 

Software Engineering 
Lorraine Duvall 
Duvall Computer Technologies 
PO Box 568 
Rome, NY 13440 
Off.: (315) 337-1564 
Res.: (315) 339-2592 
Compmail+ l.duvall 

Supercomputing 

Joanne Martin 

IBM 

12G-05 

44 South Broadway 
White Plains, NY 10601 
Off.: (914)789-7508 
Res.: (914) 686-6284 
Compmail+ j.martin

Test Technology 
Paul Bardell 
IBM 

Dept. 256, Box 003 
PO Box 950 

Poughkeepsie, NY 12602 
Off.: (914)433-5400 
Compmail+ p.bardell 

Very Large Scale Integration 
Don Bouldin 
University of Tennessee 
420 Ferris Hall 

Dept. of Electrical and Computer
Engineering 

Knoxville, TN 37996-2100 
Off.: (615) 974-5444 
Res.: (615) 966-8527 
NEWS FROM THE COMMITTEE ON PUBLIC POLICY


Classification of ‘sensitive’ information 
is once again in civilian hands 

Ralph J. Preiss, COPP Chair 


On January 8, 1988, President Reagan
signed into law the Computer Security
Act of 1987 (P.L. 100-235). According
to the Congressional Record, the major
provision of the act makes the National
Bureau of Standards the primary agency
of the government “responsible for sensitive
civil sector computer matters.”

The law provides for the development
of standards and guidelines for the
security of federal computer systems
and establishes a training program (in
computer data security) for all persons
involved in federal computer systems.

It states that the National Security
Agency will continue to have authority
over classified information in federal
computers, and it directs the two agencies
to work together to avoid duplication
of effort or the promulgation of
conflicting guidelines in the process.

The US Senate passed the act on
December 21, 1987 based on HR 145,
which the House of Representatives
passed on June 22, 1987.

Computer Society members will recall
that, in the fall of 1984, President Reagan
issued National Security Decision
Directive (NSDD) 145 giving the US
Department of Defense the authority to
classify certain “sensitive” information
in unclassified files. The objective was
to ensure that this information was
inaccessible to non-US citizens.

According to the New York Times,
the administration’s main concern was
that the broad range of technical and
scientific information available in
government and commercial computers
would be readily available to our
“adversaries” unless safeguards were
taken. The Times reported that Donald
Latham, an assistant secretary of the
DoD, had testified in 1985 that “virtually
every aspect of government and private
information is readily available”
(in computer-readable and analyzable
form)...and that “unfriendly governments
and international terrorist
organizations are finding easy pickings”
from the flood of unprotected
information.

One of the first victims of this directive
was the International Test Conference,
whose organizers were told in no
uncertain terms that a number of papers
had to be removed from the published
proceedings because they contained
what was considered to be sensitive
information.

COPP members have cooperated
with the ITC, the DoD, and the IEEE
USAB to resolve the published-proceedings
matter. COPP has also
helped formulate an IEEE position on
sensitive information. The Computer
Society’s Board of Governors didn’t
accept the position but, instead, passed
a resolution urging the president of the
IEEE to inform the DoD that we do not
recognize “sensitive” as a third and
arbitrary classification. A compromise
became necessary when the military
showed “sensitive information” to be
analogous to the way civilians label
“company proprietary information,”
except that NSDD 145 labels it as “US
proprietary.”

The best IEEE could do was to urge
transferring the labeling responsibilities
back to civilian control. Thus, the IEEE
USAB’s Committee on Communications
and Information Policy (CCIP)
worked with Congress (the chair testified
twice in 1987) and reviewed the
definitions in HR 145 before Congress
voted the measure into law.

This action supersedes the President’s
NSDD 145 and returns the control of
information classification rules to civilian
hands after three years under sole
DoD discretion.


IEEE/USAB selects
Fernbach for Citation

Sidney Fernbach, consultant for Control
Data Corp. and active Computer
Society volunteer, has been selected to
receive the Citation of Honor for 1987
presented by the IEEE’s United States
Activities Board.

The Committee on Communications
and Information Policy (CCIP) nominated
Fernbach for his contributions on
improving and maintaining the strength
of US supercomputer capability. Fernbach
has led the IEEE supercomputer
effort for the past six years, first as
chair of the Ad Hoc Committee on
Scientific Supercomputers, and then as
chair of the CCIP Subcommittee on
Scientific Supercomputers.

Fernbach, Gerald W. Gordon, F.
Karl Willenbrock, and James A. Watson
will receive plaques at upcoming
meetings of the USAB and the Professional
Activities Council for Engineers
(PACE).


Report examines 
recession’s effects on 
computer industry 

Concern about the effects of a possible
recession and a slowdown in capital
spending on the computer industry is
exaggerated, according to a forecast by
Sanford C. Bernstein & Co.

The company’s technology research
group states that the correlation widely
assumed to exist between total capital
spending and computer industry
revenues is almost nonexistent and that
capital spending and computer revenues
have moved in opposite directions in
most years.

The forecast cites product cycle as the
major determinant of computer industry
revenue growth. Strong product
cycle years tend to show strong revenue
growth regardless of economic conditions,
and years between product cycles
show slow growth even during economic
expansion.

The Bernstein technology group has
also projected the impact of a revenue
slowdown on individual computer companies
to demonstrate that the effect on
earnings would vary widely from company
to company.

Overall, Bernstein & Co. sees the
earnings of major companies growing
at 25 to 50 percent in 1988 on the basis
of major new product cycles. For companies
between product cycles, the estimated
growth rate is 15 to 25 percent.

Article looks at means 
for verifying INF treaty 

The February issue of IEEE Spectrum
includes an article by Associate
Editor John A. Adam in which he considers
the technological means for
verifying the Intermediate-Range
Nuclear Forces Treaty signed late last
year.

The article, entitled “Verifying the
New Arms Pact,” examines provisions
of the INF agreement, possible enforcement
problems, the effect of the agreement
on this summer’s Strategic Arms
Reduction Talks, the impact of Soviet
visits to US weapons sites, and plans for
destroying weaponry.
UPDATE


Starting salaries in data processing at record highs 


Starting salaries for data processors
have reached an all-time high in 1988,
according to a study conducted by
Robert Half International, data
processing and financial recruiting
specialists. The average increase over
1987 is 4.2 percent.

Some specific starting-salary ranges
are $37,000-$44,000 for project managers
at medium-size installations, a 5.2-percent
increase; $61,000-$81,000 for management
information systems directors at
large installations, a 1.4-percent
increase; $20,000-$25,000 for programmers
at small installations, a
5.9-percent increase; and $36,000-$44,000
for systems analysts at large installations,
a 2.6-percent increase.

These are national averages, and geographic
variances should be applied to
all data processing starting salaries
below $50,000, according to Robert
Half International.


The US Strategic Defense Initiative
Organization has awarded the contract
for the National Test Bed to Martin
Marietta Corp. The $500-million contract
is for five years to develop a
national network of supercomputer and
simulation facilities designed to evaluate
the feasibility of the Strategic
Defense Initiative, commonly called
“Star Wars.” With an estimated completion
cost of $1 billion, the project
would be the largest single item in the
SDI budget.

Members of Computer Professionals
for Social Responsibility, a nonprofit
public interest organization of people in
the computing field, have studied the
project’s official request for proposals,
as well as available news stories. The
organization’s preliminary analysis
resulted in the following conclusions:

• The feasibility of SDI cannot be
determined from computer simulation.
Consequently, the National Test Bed
will not establish whether SDI will work
as intended.

• The accuracy and adequacy of
computer simulations depend on the
validity of the model of events and system
properties used. With SDI, there
can be no validation of the model
because too many relevant variables are
unknown and will remain unknown.

• Dept. of Defense officials have
maintained that the US should continue
to test nuclear warheads for assurance
that they are reliable. These officials
have repeatedly stated that computer
simulation of warhead design and performance
is not an adequate substitute
for actual explosive testing of real
warheads.

• Personnel managing the National
Test Bed will become a built-in constituency
pressuring policymakers to abandon
the constraints of the ABM treaty.
NTB managers will undoubtedly argue
that their simulations would benefit
from data from real space-based component
testing, including testing that
would violate the treaty. But even component
testing linked to computer simulation
will not assure us that SDI will
work right the first time it is called into
action as a complete system.

• The Reagan administration has
portrayed SDI as a research project.
The National Test Bed is a development
project. Rather than being limited to
performing computer simulations of
ballistic missile defense, it will be a prototype
of an immense command-and-control
system for battle management
in case of nuclear war.

• SDI cannot be made reliable,
because it is impossible to test the system
under realistic conditions of actual
use; furthermore, the system’s performance
characteristics cannot be known
well enough in advance to specify the
required detail and accuracy in computer
software.

In summary, the organization, based
in Palo Alto, Calif., feels that “the
National Test Bed is a waste of taxpayers’
money.”


Neural networks may 
someday detect forgers 

Joseph Goodman, professor of elec¬ 
trical engineering at Stanford Univer¬ 
sity, and Dorothy Mighell, a graduate 
student, are pioneering a neural net¬ 
work approach to the problem of credit 
card fraud. 

A neural network can learn by trial 
and error. During training, the com¬ 
puter is presented with different ver¬ 
sions of a signature, because a person’s 
handwriting changes with his or her 
mood. Once the correct set of weighted 
features is obtained, the computer will 
correctly accept or reject a novel version 
of the signature. 

Neural network processing is 
arranged in layers. As the computer 
scans the image of a signature, small 
bits of the image are mapped onto the 
neurons in the first layer. At each layer, 
or level, the computer makes a different 
judgment on the validity of the signa¬ 
ture. The signal is processed until it 
reaches the single processor in the final 
layer of the network, which is then on 
or off (accept or reject). 
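
As a rough illustration of the layered processing described above, the following Python sketch passes a few made-up image features through one hidden layer of threshold units and into a single on/off output unit. The weights, the layer sizes, and the step threshold are all illustrative assumptions, not details of the Stanford work, and the trial-and-error training step is not shown.

```python
def step(x):
    """Threshold unit: on (1) if the weighted input is positive, else off (0)."""
    return 1 if x > 0 else 0


def layer(inputs, weights, biases):
    """Compute one layer: each neuron thresholds a weighted sum of its inputs."""
    return [
        step(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def accept_signature(features, hidden_w, hidden_b, out_w, out_b):
    """Return True (accept) or False (reject) from the single output unit."""
    hidden = layer(features, hidden_w, hidden_b)
    return layer(hidden, [out_w], [out_b])[0] == 1


# Hypothetical hand-picked weights: accept when at least two of three
# binary features are present. A trained network would learn these values.
HIDDEN_W = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
HIDDEN_B = [-1.5, -1.5, -1.5]
OUT_W = [1, 1, 1]
OUT_B = -0.5

if __name__ == "__main__":
    for features in ([1, 1, 0], [1, 0, 0]):
        verdict = accept_signature(features, HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
        print(features, "accept" if verdict else "reject")
```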

Goodman and Mighell are currently 
simulating a neural network computer 
with software. If the approach works, 
they may build a machine with actual 
neural network structure. 

Other groups at Stanford are also 
working with neural networks, includ¬ 
ing several people in the psychology 
department. 


Questions raised about SDI National Test Bed

Supercomputer aids study of aircraft control phenomenon


Using NASA’s new supercomputer
system, the Numerical Aerodynamic
Simulation Facility, a researcher
has made the most in-depth analysis to 
date of vortex breakdown, a complex 
phenomenon that can cause loss of lift 
and control for high-performance aircraft. 

A computer model developed by 
Kozo Fujii, a research fellow at 
NASA’s Ames Research Center, simu¬ 
lates the air flow field physics 
associated with vortex breakdown and 
provides new insights into its causes. 
Vortex breakdown is difficult to study 
experimentally and has remained poorly 
understood. 

The Ames computer model is the first 
computational analysis to predict spiral 
breakdown, thought to be the most 
common type of breakdown over air¬ 
craft wings. It can also predict bubble 
breakdown, the other major type. 

Fujii’s model of breakdown also is the 
first to use a real wing configuration 


Research roundtable 
appoints new members 

Five new members have been appointed 
to serve on the Council of the 
Government-University-Industry 
Research Roundtable, a discussion 
forum of scientists, engineers, adminis¬ 
trators, and policymakers sponsored by 
the National Academies of Sciences and 
Engineering and the Institute of Medicine. 

Newly appointed to three-year terms 
are Joel S. Birnbaum, vice president 
and general manager, Information 
Technology Group, Hewlett-Packard; 
Richard F. Celeste, governor of Ohio; 
Kenneth H. Keller, president, Univer¬ 
sity of Minnesota; John E. Sawyer, 
president emeritus, Andrew W. Mellon 
Foundation; and Alvin W. Trivelpiece, 
executive officer, American Assoc. for
the Advancement of Science. The new 
members join 19 current members. 

The Research Roundtable, now in its 
fifth year, was created to foster discus¬ 
sion of crosscutting science and technol¬ 
ogy policies and problems by high-level 
representatives of government, universi¬ 
ties, and industry. Most of its work is 
conducted through working groups in 
special interest areas; the three working 
groups focus on science and engineering 
talent, university research and its 
management, and partnership and joint 
ventures between government, acade¬ 
mia, and private industry. 


rather than a simplified form. The 
model tracks vortex breakdown on 
strake delta wings such as those found 
on F-16 and F-18 aircraft. It can be 
adapted for use with a variety of wing 
configurations. 

The model is three dimensional, time 
accurate, and highly detailed, with the 
flow field calculated at 850,000 grid 
points. It is based on the Navier-Stokes 
equations, a highly complex set of equa¬ 
tions describing how fluids behave, and 
it requires 25 hours running time on an 
advanced supercomputer. 

Fujii is working to improve the 
accuracy of the model. At present, the 
model is valuable for analytical pur- 


The National Science Foundation has 
signed an agreement with the National 
Aeronautics and Space Administration 
to share high-speed communications 
lines. The agreement, which will ulti¬ 
mately link university researchers now 
connected to NSF’s national computer 
communications network to databases 
and supercomputers at NASA laborato¬ 
ries, is expected to save money by 
avoiding duplication of efforts by the 
two agencies. 

The agreement is in accord with a 
report recently released by the White 
House Office of Science and Technol¬ 
ogy Policy recommending improve¬ 
ments in networking to enhance US 
leadership and provide linkages needed 
for collaborative research by scientists 
working at different institutions. 

Three NASA facilities will be linked 


The British government has rejected 
proposals for a £1000-million, five-year 
research program that would have suc¬ 
ceeded the Alvey project currently near¬ 
ing completion. Instead, most research 
in information technology will be done 
through the second phase of the Euro¬ 
pean Community’s £1100-million Esprit 
program. The report in the British jour¬ 
nal Nature states that a modest national 
initiative has been proposed to comple¬ 
ment Esprit. 

Under the Alvey program, more than 
200 industrial projects received support. 
A committee had recommended con- 


poses but requires more resolution for 
design use. One key to achieving higher 
resolution results, Fujii believes, lies in 
using an even finer grid in the calcula¬ 
tions. However, a finer grid would 
require costly additional computer time 
and enormous computer memory— 
larger than that of existing supercom¬ 
puters. 

A solution is to use a zonal method, 
which increases the number of grid 
points in selected areas while keeping 
computer processing time from increas¬ 
ing. Using this technique, Fujii plans to 
extend the model to complex geome¬ 
tries, which would include the fuselage 
as well as wing surfaces. 
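
The cost of a finer grid can be estimated with simple arithmetic. The sketch below assumes uniform refinement in all three dimensions and a running time proportional to the number of grid points. Both are simplifying assumptions; only the 850,000-point and 25-hour figures come from the report above.

```python
BASE_POINTS = 850_000   # grid points in the current model
BASE_HOURS = 25.0       # reported running time for that grid


def refined_cost(factor, base_points=BASE_POINTS, base_hours=BASE_HOURS):
    """Estimate grid points and hours when spacing shrinks by `factor` on each of 3 axes."""
    points = base_points * factor ** 3
    hours = base_hours * factor ** 3  # assumes time grows linearly with point count
    return points, hours


if __name__ == "__main__":
    for factor in (1, 2, 4):
        points, hours = refined_cost(factor)
        print(f"{factor}x finer: {points:,.0f} points, about {hours:,.0f} hours")
```

Under these assumptions, halving the grid spacing everywhere multiplies the work by eight, which is the growth a zonal method tries to avoid by refining only selected regions.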


to existing NSF regional networks, 
which in turn are connected through a 
national backbone network. The God¬ 
dard Space Flight Center in Greenbelt, 
Md., will be linked to the Southeastern
Universities Research Associates Net 
(SURANet); the Ames Research Center 
in Mountain View, Calif., will be linked 
to the Bay Area Regional Research Net 
(BARRNet); and the Johnson Space
Center in Houston will be linked
to SESQUINet, a regional network in 
Texas. 

Authorized scientists will be able to 
remotely access and use NASA data in 
their research and apply for time on 
NASA supercomputers. NASA-funded 
scientists at universities served by NSF 
regional networks will be able to com¬ 
municate and collaborate with col¬ 
leagues at the NASA centers. 


tinuation of the program, with a focus 
on the application of new technologies 
and an increased government contribu¬ 
tion of £425 million. 

The government’s response, contained 
in a white paper, indicates a shift 
toward more collaborative research and 
fewer “near-market” initiatives, which 
the government prefers to have industry 
pay for. 

Nevertheless, the Dept. of Trade and
Industry plans to contribute £8 million 
over three years to a national program 
of research into high-temperature 
superconductivity. 


NSF and NASA to link computer networks 


Picture darkens for UK computer industry

25th ACM/IEEE DESIGN
AUTOMATION CONFERENCE

JOIN THE 25th ANNIVERSARY CELEBRATION 

For 24 years the Design Automation Conference has been the 
meeting place for electronic CAD/CAM Engineers 

Always 1st in the latest CAD/CAM Technological Innovations 

NOW IN THE 25th ANNIVERSARY YEAR, DAC WILL OFFER: 

Over 130 papers, tutorials and workshops. 

Over 110 vendors of CAD/CAM hardware and software will
exhibit their products.

Over 50 vendor-technical presentations on Sunday afternoon,
June 12th - Many announcing new products.


Anaheim Convention Center, Anaheim, CA


June 12-15, 1988 

Advance Registration ends May 14, 1988
For More Information, Call 1-303-530-4333 


HOTEL RESERVATION FORM 
25th Design Automation Conference
June 12-15, 1988
INDICATE HOTEL PREFERENCES ABOVE AND SEND COMPLETED FORM TO:


NEW PRODUCT REVIEWS 


Editor: Richard Eckhouse, MOCO, Inc., PO Box A, 91 Surfside Rd., Scituate, MA 02055; Compmail+, r.eckhouse 


Alternative input devices 

Richard Eckhouse, New Product Reviews editor 


While most of us use the keyboard as 
a standard input device, we have many 
other alternatives, the most common 
being mice, light pens, trackballs, touch 
screens, and tablets. More esoteric (or 
eccentric) devices are also available. 

Of all the alternative devices, the 
mouse is probably the most popular. 
Low in cost and easy to use, the mouse 
requires relatively little desk space and 
fits naturally with screen-oriented
applications. Using a mouse, however,
requires the ability to coordinate what
the eye sees with what the hand does.

Try switching to a trackball after 
mastering a mouse. Some operations 
seem quite natural, although you may 
notice a tendency to want to move the 
whole device and not just roll the 
trackball. 

Some maneuvers can be really clumsy 
with the trackball, like click and drag 
operations. However, the better track¬ 
balls will usually include a cursor drag 
function that does not require holding 
down a button while moving the track¬ 
ball. Since this function is implemented 
by pressing a combination of buttons, 
most trackballs have at least three 
buttons. 

The trackball has another advantage: 
it doesn’t require the movement space 
that a mouse does. And, if the trackball 
emulates one of the more popular mice, 
it means that the trackball can directly 
and transparently replace the mouse. 
You can plug the trackball into a bus
mouse interface card and use the standard
mouse software. Alternatively, if
you plug it into a serial port, you
can use the existing mouse software
because the trackball emulates the
mouse. However, trackballs typically
cost more than mice because of their
lower sales volume and the smaller
number of vendors.

Light pens and joysticks also have 
many of the attributes described, but 
for many reasons have not caught on. 
Probably the primary reason is that 
both devices require an input port 
different from the serial port that mice 
and trackballs use. Of course, software 
is another important consideration and, 


since most applications include only 
mouse or tablet drivers, the manufac¬ 
turers of these alternative input devices 
do not enjoy the widespread support 
that mice do. 

While touch screens are most noted 
for their appearance on information 
kiosks and in high-end systems (military 
and commercial), they generally cost 
two to ten times more than a mouse and 
yet have considerably less resolution. A 
typical 12- to 19-inch screen has 256 × 256
addressable points (about 25 points per 
inch), whereas a mouse has a minimum 
of 200 points per inch. Also, a touch 
screen has no equivalent to the buttons 
found on mice, trackballs, and tablet 
pointers. 
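
The resolution comparison can be checked with a quick calculation. This sketch assumes a 4:3 screen whose size is measured diagonally, which the figures above do not state; the function name and defaults are mine.

```python
import math


def points_per_inch(diagonal_in, points_across=256, aspect=(4, 3)):
    """Horizontal addressable points per inch for a screen of the given diagonal."""
    w, h = aspect
    width_in = diagonal_in * w / math.hypot(w, h)  # visible width from the diagonal
    return points_across / width_in


if __name__ == "__main__":
    for diagonal in (12, 19):
        print(f"{diagonal}-inch screen: {points_per_inch(diagonal):.1f} points per inch")
```

Under these assumptions the 12-inch screen comes out near the figure of about 25 points per inch quoted above and the 19-inch screen lower still, so either way the touch screen falls far short of a mouse's 200 points per inch.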

While generally priced several times 
higher than a mouse, tablets are becom¬ 
ing real contenders in the PC market¬ 
place. Like the other devices mentioned, 
much of the popularity of this particu¬ 
lar input device seems to depend on 
several things: ease of use, price, and 
compatibility with an existing appli¬ 
cation. 

Tablets best suit applications like 
CAD/CAM, desktop publishing, 
graphic arts, and data entry by means 
of menu selection. Tablets work best 
with these applications for two primary 
reasons. First, in these applications the 
user must choose among a large number 
of alternatives. Second, the alternatives 
are arranged in a fixed format better 
suited to a tablet’s absolute positioning. 
For example, in the CAD environment 
this means descending down menus of 
items until you choose a specific option. 
Then, without stopping, you must 
immediately execute an option and per¬ 
form specific tasks such as drawing an 
object; dimensioning, transforming, or 
erasing an object; or selecting among 
the components that make up an object. 
The catch with a tablet is that it takes 
up a lot of desk space. 

A typical tablet consists of a flat
desktop drawing surface ranging in size
from 12 × 12 inches to 16 × 16 inches.
Not all of the surface is active; the
smaller work space occupies from 8 × 8
to 12 × 12 inches. The technology used


to implement the tablet is usually an 
electric or magnetic field activated when 
a pen or puck (I will refer to either as a 
pointer) comes within a certain prox¬ 
imity of the tablet’s active area. For 
most tablets, this translates to within 
half an inch above the surface. 

The tablet pointer usually comes
equipped with one or several buttons.
As you move the pointer, a stream of x-y
coordinates is sent back to the computer
via the serial port. The coordinates
can be either relative or absolute. To 
integrate a tablet into a specific applica¬ 
tion, a software driver must be provided 
(just like for a mouse). 
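
To make the difference between the two coordinate styles concrete, here is a minimal Python sketch of what a driver might do with the stream. The function names are hypothetical and no particular tablet's data format is implied.

```python
def to_relative(absolute_points):
    """Convert a stream of absolute (x, y) positions into (dx, dy) movements."""
    deltas = []
    previous = None
    for x, y in absolute_points:
        if previous is not None:
            deltas.append((x - previous[0], y - previous[1]))
        previous = (x, y)
    return deltas


def to_absolute(start, deltas):
    """Rebuild absolute positions from a starting point and relative movements."""
    points = [start]
    for dx, dy in deltas:
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


if __name__ == "__main__":
    stream = [(10, 10), (12, 15), (12, 9)]
    print(to_relative(stream))
```

Absolute reports tie each pointer position to a fixed spot on the tablet, which is what menu templates rely on; relative reports carry only movement, which is how a mouse behaves.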

Two features are quite valuable to 
have with a tablet: mouse compatibility 
(such as using the pointer as a mouse) 
and programmability of the device to 
allow instant changes as the user moves 
from one application to another. 

In this review we look at a new track¬ 
ball and two tablets. Each of these 
devices exhibits the features we’ve 
described while adding a few extra 
touches that make them unusual in their 
own right. Choosing which one is right 
for you will depend on the specific 
application and your preferences. 


MicroSpeed FastTrap 

When it comes to trackballs, the 
MicroSpeed FastTrap is in a class of its 
own. First of all, it’s the only device to 
include a z-axis trackwheel in addition 
to the normal x-y axes trackball. Sec¬ 
ond, it’s small (about 4 inches wide by 
7.5 inches long and 2-2.5 inches high), 
lightweight, and appears to have been 
designed to reside to the right (or left) 
of the normal PC keyboard. It’s made 
of durable plastic shaped to provide a 
comfortable feel during normal use. 
And it’s a three-button device that 
offers a number of interesting and use¬ 
ful options. 

Because it’s compatible with the 
Microsoft serial mouse, you can unplug 
that device, plug in FastTrap, and start 
using it immediately. Power is supplied
through the serial port. Everything
works normally except, of course, that 
you don’t need a large working area to 
use it. In addition, the folks at MicroSpeed 
solved the click-and-drag problem, as 
well as the double-click problem, so 
that using their device is actually sim¬ 
pler than using a mouse. They did this 
by using the middle button as a drag 
on/off switch. If you press and hold 
either the right or left button, and then 
the middle button, you stay in drag 
mode. Press any button again and drag 
turns off. In addition, if you press and 
hold both left and right buttons, and 
then press the middle button, pressing 
the middle button from then on emu¬ 
lates pressing both left and right but¬ 
tons simultaneously. 
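
The drag on/off behavior can be pictured as a small state machine. The Python sketch below is only my model of the behavior as described, not MicroSpeed's firmware: the class and method names are hypothetical, the press that ends a drag is assumed to be consumed rather than passed on, and the both-buttons emulation is left out.

```python
class DragLock:
    """Toy model of a middle-button drag latch on a three-button trackball."""

    def __init__(self):
        self.held = set()     # left/right buttons physically held down right now
        self.latched = None   # button reported as held while drag mode is on

    def press(self, button):
        if self.latched is not None:
            # Any press while drag is on turns drag off (press consumed: assumption).
            self.latched = None
            return
        if button == "middle":
            # Middle pressed while exactly one of left/right is held: latch it.
            if len(self.held) == 1:
                self.latched = next(iter(self.held))
            return
        self.held.add(button)

    def release(self, button):
        self.held.discard(button)

    def reported_buttons(self):
        """Buttons the host sees as down, including a latched one."""
        down = set(self.held)
        if self.latched is not None:
            down.add(self.latched)
        return down


if __name__ == "__main__":
    ball = DragLock()
    ball.press("left")
    ball.press("middle")
    ball.release("left")
    print(ball.reported_buttons())  # left still reported while the drag is latched
```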

In actual use, the device is fun to use 
and much more convenient than a 
mouse. For those of us who need some 
time to get accustomed to a trackball in 
mouse mode, the z-axis trackwheel can 
be used to vary the trackball gain. As 
MicroSpeed points out, “Power users 
will probably like the more responsive 
fast gain setting while new users may 
initially be more comfortable learning 
to use the pointer on the slow gain set¬ 
ting.” To change from one mode to 
another, you simply rotate the track- 
wheel up or down one revolution. The 
gain settings are slow, normal, and fast, 
and correspond to 50, 100, and 200 
pulses per inch. 
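
To see what the three gain settings mean in practice, the following sketch converts ball travel into pulses. The 640-pixel screen width and the one-pixel-per-pulse mapping are assumptions of mine; only the 50, 100, and 200 pulses-per-inch figures come from the product.

```python
GAIN_PULSES_PER_INCH = {"slow": 50, "normal": 100, "fast": 200}


def pulses_for_travel(inches, gain):
    """Number of movement pulses sent for a given ball travel at a gain setting."""
    return round(inches * GAIN_PULSES_PER_INCH[gain])


def travel_to_cross(screen_pixels, gain, pixels_per_pulse=1):
    """Inches of ball travel needed to move the cursor across `screen_pixels`."""
    return screen_pixels / (GAIN_PULSES_PER_INCH[gain] * pixels_per_pulse)


if __name__ == "__main__":
    for gain in GAIN_PULSES_PER_INCH:
        print(f"{gain}: {travel_to_cross(640, gain):.1f} inches to cross 640 pixels")
```

On those assumptions the slow setting needs four times the ball travel of the fast setting for the same cursor movement, which is why it feels more forgiving to a new user.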

So much for making this device every 
bit as easy to use as a mouse. In actual 
operation, you will generally find the 
trackball convenient when you need to 
accurately position the cursor and then 
take some action. A mouse typically 


tracks off the work space, requires you 
to lift and reposition the device, and 
then moves just as you release it or 
attempt to press a button. A trackball, 
on the other hand, allows you to pan 
over your application by rolling the 
trackball in the palm of your hand, then 
use your fingertips to precisely position 
it where you want it. Because the device 
sits in a stable position on your desk, it 
doesn’t move when you lift your hand 
or try to press a button. 

Besides supporting the Microsoft 
mouse standard with its set of compati¬ 
ble function calls, the FastTrap track¬ 
ball includes its own set of extended 
function calls. These function calls 
make it possible to take full advantage
of this three-axis pointing device. However,
the newness of the device means
that there are few, if any, applications 
that use this FastTrap feature. In their 
well-done manual, the MicroSpeed 
designers devote a full chapter to future 
applications in such fields as desktop 
publishing, word processing, CAD/CAE, 
spreadsheets, databases, and games. A 
number of programs come with the sys¬ 
tem, including both MAP.SYS and 
MAP.COM, the MicroSpeed device 
drivers. While you don’t need to use 
these drivers for normal mouse emula¬ 
tion, you will need them when using 
KeyMap, a keyboard emulator 
program. 

KeyMap is a TSR program that 
allows you to make full use of FastTrap 
with applications that do not currently 
support a pointing device. A number of 
templates already included with Key- 
Map translate FastTrap operations into 
equivalent keyboard operations. For 


example, using the predefined DOS 
template, you can move the trackball in 
the x-y plane and get the same result as 
pressing the arrow keys. Moving the 
trackwheel acts the same as pressing the 
page-up or page-down keys. Pressing 
one of the three buttons is equivalent to 
hitting the Return, Esc, or F3 keys, 
respectively. In combination with the 
Shift, Alt, and Control keyboard keys,
the buttons can take on up to 12 differ¬ 
ent keycodes. 
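
A template of this kind amounts to a lookup table from pointing-device events to keystrokes. The sketch below models the DOS template in Python; the event names and the modifier-plus-key naming are hypothetical stand-ins, since the actual KeyMap file format and the keycodes it assigns to modified buttons are not described here.

```python
DOS_TEMPLATE = {
    "ball_up": "Up", "ball_down": "Down",
    "ball_left": "Left", "ball_right": "Right",
    "wheel_up": "PgUp", "wheel_down": "PgDn",
    "button1": "Return", "button2": "Esc", "button3": "F3",
}

# No modifier plus the three keyboard modifiers gives four states per button.
MODIFIERS = (None, "Shift", "Alt", "Ctrl")


def translate(event, modifier=None, template=DOS_TEMPLATE):
    """Return the keystroke a template emits for a pointing-device event."""
    key = template[event]
    return key if modifier is None else f"{modifier}+{key}"


def button_keycodes(template=DOS_TEMPLATE):
    """All keystrokes reachable from the three buttons, with and without modifiers."""
    buttons = ("button1", "button2", "button3")
    return [translate(b, m, template) for b in buttons for m in MODIFIERS]


if __name__ == "__main__":
    print(translate("wheel_up"))
    print(len(button_keycodes()), "button keycodes")
```

Three buttons in four modifier states account for the 12 keycodes mentioned above.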

I tried using KeyMap with the Norton 
Editor, DOS, PCTools, and Leading 
Edge Word Processor and really liked 
the results. Except for Leading Edge, 
which takes control of the keyboard 
interrupt and therefore did not work, 
each of the other applications was much 
easier to use under KeyMap. Rapid 
scanning of directory lists, menu selec¬ 
tion, and keyboard responses were easy 
to accomplish using only the trackball. 
Part of the reason for this is that you 
can place your hand on the FastTrap 
device without really looking at it, and 
then, with eyes directed at the screen, 
make your choice. 

Up to 32 templates can be stored with 
each activation of KeyMap. MicroSpeed 
includes several for such programs as 
Lotus 1-2-3, WordPerfect, WordStar, 
dBase, and Turbo Pascal. You can cus¬ 
tomize each of these templates as well 
as create your own using the KeyMap 
edit function. You can also disable Key- 
Map for applications requiring mouse 
emulation. 

MicroSpeed just recently began ship¬ 
ping its own driver for AutoCAD. With 
this new driver, you can select menu 
items directly, using the trackwheel in 
combination with the middle button. 
The result is that you no longer need to 
move the cursor out of the active draw¬ 
ing area and into the menu area to 
select a menu item. 

I’m sold on this trackball and find 
that I prefer it to my mouse. It’s com¬ 
pletely compatible with DESQview and 
the applications running under that 
window manager. It’s also compatible 
with all the other applications I use, 
including MS-Word and CADKEY. At 
$149, it’s slightly more expensive than a 
mouse, but it surely packs in more 
features.


Summagraphics Bit Pad Plus

Summagraphics is probably the best
known supplier of digitizing tablets.
The company has been a leader in
developing low-priced digitizers, and
the SummaSketch is often taken as the
standard against which to measure
other tablets. It is also generally the
first tablet supported by new drawing or
CAD/CAE packages. In a very real
sense, Summagraphics has become its
own competition.

In designing a new tablet, Summagraphics
wanted it to be full-sized with
precise drawing and tracing capabilities.
Lower cost was an important goal as
well. There is no question that they met
all of these goals in the Bit Pad Plus.
Equally important was ease of use,
which translates into a “plug and play”
integration with existing applications.

When it comes to installing and using
a tablet, the new Summagraphics Bit
Pad Plus is a snap. All you have to do is
plug either the stylus or four-button
puck into the tablet, connect both the
tablet and the power supply to the
adapter plug using the cables supplied,
and plug the adapter into your PC using
its 25-pin serial port. All of this is
explained in great detail in the illustrated
user’s manual. However, it’s
darned near impossible to make a mistake
hooking things up, so looking at
the manual would be the last thing most
of us would do.

When you turn on the power to the
tablet, it immediately calibrates itself.
During calibration, the power light
blinks to indicate what’s going on.
From then on, the light serves as a
proximity indicator, steady on when the
pointer is in range and blinking whenever
the pointer is out of proximity.
This useful feature is complemented by
the beeping of the tablet whenever a
button on the puck or the tip of the stylus
is pressed. I think every tablet
should have these features and I hope
other manufacturers take note.

Using the Bit Pad Plus, once physically
connected, is simply a matter of
selecting it from the list of devices supported
by the applications software for
which it will be used. No software is
supplied with the tablet, so, if you want
to use it with an application that
doesn’t support it, you will have to
write your own driver. While Appendix
A of the user’s guide does specify the
baud rate and data format, you will
need to order some additional manuals
from Summagraphics if you choose to
write your own driver.

The resolution of the Bit Pad Plus is
up to 254 lines per inch and the data
rate is fixed at 9600 baud. When used
with the CAD system from CADKEY,
the tablet worked flawlessly. Menu
selection and cursor movement were
smooth and accurate. The button click
was loud enough to be heard above the
noise of the computer and distinct
from the keyboard click. The nonmarking
stylus was easy to hold and glided
smoothly across the 12-by-12-inch
active drawing area. Overall size is
approximately 16 inches wide by 17
inches tall, with a slight slope front to
back. The tablet is light enough to hold
on your lap, and is completely enclosed
in a high-impact molded plastic case.

Summagraphics uses Charge Ratio as
its technology in the new Bit Pad Plus.
The company claims that this technology
offers the lowest cost and highest
reliability in a digitizing tablet. Certainly,
the single-board design and
fewer parts should make this a low-
maintenance, high-reliability device. At
its current price of $495, which includes
both the stylus and four-button puck,
the tablet is certainly a good value. I
would expect this price to drop in the
near future, in part because of the competitiveness
of this marketplace and the
economies that come as production
ramps up.

In summary, the Bit Pad Plus is a
well-made, no-frills tablet. If you need
a low-cost device, supported by nearly
every application, that you can take out
of the box and use immediately, then
this is the tablet to buy.

Summagraphics reportedly designed its Bit Pad Plus to be a low-cost, full-sized
graphics tablet with precise drawing and tracing capabilities.


Kurta IS/ONE

While at first glance the Kurta IS/ONE 
may look like an ordinary tablet, it is 
anything but. A close look reveals a 
series of soft switches across the top of 
this device, while along the back panel 
are three dip switches, offering a multi¬ 
tude of tablet settings. There is no 
external power supply. Also, the eight- 
degree slope of the tablet is noticeably 
steeper than found on other similar 
tablets. 

Common to the IS/ONE and other 
tablets is the molded plastic case, choice 
of stylus or four-button puck, and 
12-inch-square drawing surface. The 
differences start, however, with its 
three-button, cord-free stylus, as well as 
a cord-free model of the four-button 
puck. 

The most significant features that dis¬ 
tinguish this tablet from the others are 
its programmability and extensive use 
of soft keys. To understand this, we 
need to first examine the dip switches in 
the back of the unit that make this a 
multiple personality tablet. 

The first group of switches deter¬ 
mines the mode and hence how the tablet 
sends data to the computer. Normally 
the tablet is in auto mode, which means 
that data is continuously output when 
the pointer is within proximity of the 
tablet surface. Five other modes are 
also possible. 

The next group of switches deter¬ 
mines the menu located at the top of the 
tablet. The last group within the first 
dip switch selects the baud rate, ranging 
from 300 baud to 19.2 kilobaud, with 
auto speed recognition possible. In the 
second dip switch, the data format 
(including encoding, number of data 
bits, and parity) and rate (conversions 
per second) are specified. The last dip
switch sets the resolution (200 to 1000
points per inch in English or metric 
units), CTS signal recognition, and 
emulation mode (for prior Kurta tablets 
or even tablets bigger or smaller than 
the one currently being used). 
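
The baud rate and the conversion rate interact, because every coordinate report has to fit down the serial line. The sketch below gives an upper bound under assumptions of my own: a five-byte report and 10 bits on the wire per byte (start bit, eight data bits, stop bit). The IS/ONE's actual report formats are set by the switches and may differ.

```python
def max_reports_per_second(baud, bytes_per_report=5, bits_per_byte=10):
    """Upper bound on coordinate reports per second over an asynchronous serial link.

    bits_per_byte defaults to 10: one start bit, eight data bits, one stop bit.
    """
    return baud / (bytes_per_report * bits_per_byte)


def counts_across(inches, points_per_inch):
    """Number of distinct positions along one axis at a given resolution."""
    return int(inches * points_per_inch)


if __name__ == "__main__":
    for baud in (300, 9600, 19200):
        print(f"{baud} baud: at most {max_reports_per_second(baud):.0f} reports per second")
    print(counts_across(12, 1000), "positions across a 12-inch surface at 1000 points per inch")
```

With these assumed figures, a 300-baud link could carry only a handful of reports per second, so a high conversion rate is useful only when the baud rate is raised to match.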

How these dip switches are set deter¬ 
mines the default setting for the tablet 
on power up. If you change from one 
application to another and need to use a 
different set of settings, it is not neces¬ 
sary to turn off the unit and start all 
over. First, if you want one of the 
default settings stored within the 16-bit 
CMOS processor built into the tablet, 
you merely have to touch one of the 
soft keys labeled P1 to P5 to instantly
make the change. For example, one of 
these soft keys makes the tablet look 
like a Microsoft mouse; another makes 
it look like a Summagraphics MM series 
tablet. If the preprogrammed soft keys
don’t offer what you want, you can
download a new configuration to any one
of them, specifying every possible option.

Even more custom tailoring is possi¬ 
ble. In addition to the configuration 
keys, there are pen function keys that 
work with the dual action stylus. The 
“soft” setting sets up the pointer for 
drawing, while the “snap” aids menu 
selection. If you desire, you can even 
enable both. 

Finally, we come to the function 
keys. These 13 soft keys can be pro¬ 
grammed to transmit any series of key¬ 
strokes you want. Thus, you can tailor 
the keys for a particular application. In 
the case of CADKEY, I used this capa¬ 
bility to load the soft function keys with 


CADKEY’s immediate commands (Alt- 
A for auto scale or Ctrl-X for cursor 
snap). By doing this, I could avoid 
going to the keyboard for frequently 
executed commands. As you can 
imagine, this really made me more 
productive, since I needed less access to 
the keyboard. 

With all this functionality, you’d 
think IS/ONE would be a very expen¬ 
sive tablet. It’s not. Depending on tab¬ 
let size, configuration, and options, 
prices range from $565 to $1145. In my 
case, the price for the tablet was $495, 
plus $50 for the IS/ONE AT kit, $100 
for the four-button puck, and $150 for 
the cord-free pen. 

Besides this hardware, you get a dis¬ 
kette containing a series of drivers and a 
configuration program. The configura¬ 
tion software allows you to define 
whether the tablet is used in relative or 
absolute mode, the equivalent key¬ 
strokes for the function keys, the active 
area, and the communications port it’s 
connected to. GEM and Ventura installation
instructions are also included so
that you can set the tablet to work with 
these systems. Other software includes 
an interface to Microsoft Windows, a 
software development kit, and tem¬ 
plates for Aldus PageMaker, 
AutoCAD, and VersaCAD. 

One thing about the Kurta tablet
could use improvement. The manuals
read like engineering specifications
rather than user’s guides. While all the
necessary information is packed into
the manuals supplied with the tablet,
they are very hard to read. It must have
taken me at least three readings in some
cases to make sure I understood exactly
what to do or what features and functions
were available for use.

IS/ONE is an alternative input device
for the tablet aficionado. For a little bit
more money than a standard tablet
costs, you get programmability and
versatility, and a device that can work in
practically any environment requiring
either a mouse or a tablet. This versatility
alone can justify the additional cost.


Recommendations

All of the devices discussed here are
excellent alternatives to the standard
keyboard. People unaccustomed to the
hand-eye coordination that a trackball
or tablet requires will need some “training”
time to master the art of using
these new devices. Confirmed mouse
users should be able to make the adjustment
quickly.

I chose to look at three devices ranging
in price from $149 to $1145. Each
offers price-performance advantages
that depend on the user’s needs and the
application environment. Thus, it’s difficult
to recommend one over the other.
Suffice it to say that no matter which
one you choose, you will enjoy these
devices and they will make your interaction
with your application more efficient.


Entry-level networking system

Dick Eckhouse, New Product Reviews editor


A promising new product from
Trans-M Corp. of Medfield, Mass.,
offers low-cost local area network
connections between IBM PCs and
compatibles. Using CSMA/CD and a proprietary
protocol, the Net-127 system links up to
127 systems at 250 kilobits per second in
an easy-to-use and easy-to-install LAN. The
hardware consists of a short board and
25 feet of telephone cable with RJ-11
jacks on each end. The software comes
on a floppy and includes an installation
procedure.

I installed the boards in several Compaq
computers, following the installation
procedure to assign a station ID
and number of remote drives. I fixed up
the CONFIG.SYS and AUTOEXEC.BAT
files to include this new
device. I ran the NCONFIG program to
assign device letters, access level, and
drive assignments as required. Finally, I
assigned the printers and the single
COM port, again indicating accessibility.
I did all of this with only a cursory
glance at the manuals, attesting to the
ease of use of this product. The
manuals, while brief, do cover all of the
installation and use in detail, and
include a lot of troubleshooting information
for those who may need it.

I tried out the network in a couple of
typical LAN applications. In the first
case, I used one node, with several hard
disks, as a file server to another node.
Things worked well, although the time
it took to remotely load a large EXE
file was noticeably longer than if the file
had been loaded locally. Obviously, the
speed of the network was the limiting
factor. What this case did illustrate was 
the feasibility of the approach and that 
a low cost system can be very func¬ 
tional. 
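
To put the remote-load delay in perspective, the following is a rough sketch of the best-case transfer time over the link. It reads the quoted 250 kilobits as a per-second rate and ignores protocol overhead and contention; the 200-kilobyte file size is a hypothetical figure, not one taken from the review.

```python
# Best-case transfer time over a shared link (illustrative sketch only).
LINK_RATE_BITS_PER_SEC = 250_000  # Net-127 rate quoted in the review, read as bits per second


def transfer_seconds(size_bytes: int, rate_bits_per_sec: int = LINK_RATE_BITS_PER_SEC) -> float:
    """Return a lower bound on transfer time, ignoring overhead and collisions."""
    if rate_bits_per_sec <= 0:
        raise ValueError("rate must be positive")
    return size_bytes * 8 / rate_bits_per_sec


if __name__ == "__main__":
    exe_size = 200 * 1024  # hypothetical 200-kilobyte EXE file
    print(f"{transfer_seconds(exe_size):.1f} s")  # prints "6.6 s"
```

Even under these generous assumptions the load takes several seconds, which is consistent with the reviewer's observation that the network, not the disk, was the bottleneck.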

My second test was more typical: I 
used the net as a means of remotely 
accessing data. Here, performance was 
excellent, and the application worked 
exactly as it would have if the data had 
been stored locally. 

In my last test, I used the net to 
access a remote printer. Because the 
speed of the network was faster than 
the bandwidth of my dot matrix printer, 
there was virtually no penalty for the 
remote access. The bottom line is that 
everything worked well, completely 
transparent to the application and to 
me, the user. Of course, I did have to
give up 32 kilobytes of RAM on each
machine for the memory-resident
Net-127 software, but this seemed a
rather modest sacrifice.

The Net-127 from Trans-M connects
IBM PCs and compatibles, using
CSMA/CD and a proprietary protocol.
Net-127 consists of software and an
adapter card, shown here.

I did have some minor problems that 
may have been unique to my configura¬ 
tion. On my XT Compaq Deskpro, I 
couldn’t find an interrupt and port 
address that wasn’t taken up by the
several additional boards, such as a bus
mouse and second hard card I had 
previously installed. On my Compaq 
286 Deskpro, I ran into a conflict with 
the Intel Above Board. In both cases, 
Trans-M technical support was most 
helpful, thanked me for the informa¬ 
tion I provided, and gave me several 
options to work around the problems. 
One solution would have been to 
modify their boards, something I chose 
not to do. The other was to simply pull 
out the offending boards during my 
tests; this was a simple way to get going 
quickly and with little hassle (and was 
my obvious choice). This should not be 
a problem in the future, I was 
informed, because the next release of 
the system will solve both of these 
problems. 

I really would have liked to have kept 
this system installed. In a small shop 
like ours, linking machines sure beats 
the current method of moving floppies 
from machine to machine, and avoids 
the double density-high density floppy 
problems when exchanging diskettes 
between AT and XT class machines. 


The savings in time, and the obvious 
savings in not having printers attached 
to each machine, would more than 
justify the price of $249 per station. 
However, there are two reasons that I 
will have to wait. First, the hardware 
conflicts in the two Deskpros have to be 
resolved. In all fairness to Trans-M, this 
is not entirely their doing, but is typical 
of the “open architecture” lack of stan¬ 
dardization in the PC. Second, as the 
Trans-M board stands, I had to give up 
either COM1 or COM2 for use by the 
Net-127 system. Since I often have 
devices attached to both ports, this 
presents a problem. As I’ve already 
said, Trans-M plans to change things in 
the next version of its hardware and 
software, thereby allowing a wider 
selection of interrupt and port addresses. 

I hope to be able to give an updated 
report in a future column. This is an 
exciting and low-cost product that 
should and could be a part of every PC 
system where there is an obvious need 
to share devices. 

Another visit with the Living C interpreter

Daniel McAuliffe, McDonnell Douglas Astronautics Co. 


I first reviewed Living C Personal for 
the March 1986 issue of IEEE Software. 
Since that time, the product has been 
renamed Living C Plus, a number of 
changes have been incorporated, and 
new features have been added. 

The most visible change is the addi¬ 
tion of a new menu and windowing sys¬ 
tem. Pull-down menus let you access 
program functions, with user-definable 
shortcut keys available for the more 
commonly used options. All text and 
messages in Living C are viewed 
through windows. You can move and 
resize any window and also control the 
window color scheme. The new menu 
and windowing system resembles that 
found in other products in the micro¬ 
computer marketplace. It is easy to use 
and a great improvement on the previ¬ 
ous interface. 

The program editor has received a 
major overhaul. You can now edit up to 
12 files of any size at the same time.
Each file is assigned to a separate user-
configured window. Data is moved 
between files with a cut and paste 
buffer. The editor also includes a 
powerful undo feature. You can reverse
as many as 35 previous actions.

Another nice feature is the auto-save 
function. You can specify an elapsed 
time, in minutes, after which the editor 
will automatically save to disk any files
that have been modified. This includes
any file in the list of files you may cur¬ 
rently be editing. 

The number of library functions 
provided with the new version of Living 
C has been increased from approxi¬ 
mately 50 to more than 140. All func¬ 
tions conform to the proposed ANSI 
standard. 

The help facilities have been 
expanded from the previous version. 
They are excellent in every respect. 
Information on program compilation 
errors often includes examples of the 
types of usage that may have generated 
the error—a great improvement over 
the cryptic messages generated by many 
compilers. 

The documentation supplied with 
Living C is much improved over the 
previous version, although it still con¬ 
tains an annoying number of small 
errors, such as discrepancies between 
figures and text. It could easily tolerate 
another round of proofreading. 

This new version still requires an IBM 
or 100-percent-compatible PC with 
DOS 2.0 or higher to run. However, the 
amount of memory needed is now 512 
kilobytes. 

The most outstanding features of 
Living C are still the debugging and 
tracing facilities. The animation mode 
is valuable not only for finding obscure
program errors, but for insight into the
way C expressions are evaluated. 

In addition to animation mode, Liv¬ 
ing C includes support for breakpoints, 
single stepping through expressions 
using a single keystroke, and monitor¬ 
ing and changing the values of program 
variables. These functions were present 
in the previous version, but ease of use 
has improved considerably with the new 
menu and window system. 

About the time I received the current 
version of Living C for review, I also 
received a copy of the Microsoft Quick 
C compiler. This made it very difficult 
to evaluate Living C without some com¬ 
parison with Quick C. Although the 
Living C trace and debug features hold 
up well against the Quick C debugging 
features, it has a difficult time compet¬ 
ing with Quick C when it comes to addi¬ 
tional features and price. Quick C 
offers a superior development environ¬ 
ment, program make facilities, exten¬ 
sive graphics functions, and an upgrade 
path to the more powerful Microsoft 
V5.0 compiler. Quick C also sells for 
$100 less than Living C. 

The new version of Living C is much 
improved over the previous version and 
offers a number of interesting features. 
However, at $200 it may be too little for 
the additional cost. 

NEW PRODUCTS


Delta FPP card helps PC-AT handle neural nets 


The Technology Research Group of
Science Applications International
Corp. has announced the Delta Floating
Point Processor accelerator card for the
IBM PC-AT. According to SAIC, the
card, when combined with window software,
allows users of the company’s
Artificial Neural Systems Simulation
Software (ANSim) to build neural networks
on the PC-AT.

The Delta FPP reputedly achieves 
peak speeds of 22 Mflops. It comes with
12M bytes of memory and supports 32-
and 64-bit floating-point and integers. 
The card costs $14,950. 

Software for the Delta FPP includes 
13 user-configurable neural models in 
ANS simulation, provided in the com¬ 
pany’s ANSkit. The 13 paradigms 
included in ANSkit are written in C 
under Microsoft Windows. (See New 
Products in the January issue of Com¬ 
puter, p. 93, for a description of 
ANSkit.) ANSkit costs $995. 


Also available for the board are a C 
compiler, an assembler, and an actor- 
based language called ANSpec. 

Technology Transfer will sponsor 
“Neural Networks for Artificial Intelli¬ 
gence” to be held in Washington, D.C., 
March 21-23. Geoffrey Hinton of the 
University of Toronto will lecture dur¬ 
ing the three-day course, which costs 
$895. Participants will manipulate the 
neural models provided in SAIC’s 
ANSim software. Telephone (213) 
394-8305 for more details. 

Anza-Plus turns a PC-AT into a neurocomputer


Hecht-Nielsen Neurocomputers 
offers the Anza-Plus Neurocomputing 
Coprocessor System, which reputedly 
transforms an IBM PC-AT or 80386- 
compatible into a neurocomputer capa¬ 
ble of real-time processing. The board 
targets both applications development 
and delivery. 

The company claims that the Anza- 
Plus can implement neural networks 
with up to 2.5 million processing ele¬ 
ments and interconnections, and can 
update the network at a peak rate of 10 
million interconnects per second or at a 
sustained rate of 6 million IPS in the 
feed-forward mode, 1.5 million IPS in 
learning mode (back propagation). 
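
The quoted rates translate directly into the time needed for one full pass over a network. The sketch below does that arithmetic. It treats the 2.5-million figure as all interconnects, which is an upper-bound assumption, since the announcement counts processing elements and interconnections together.

```python
# Time for one update of every interconnect at the rates quoted for the Anza-Plus.
RATES_IPS = {
    "peak": 10_000_000,
    "feed-forward (sustained)": 6_000_000,
    "learning (back propagation)": 1_500_000,
}


def seconds_per_update(interconnects: int, rate_ips: int) -> float:
    """Return the time to update each interconnect once at the given rate."""
    if rate_ips <= 0:
        raise ValueError("rate must be positive")
    return interconnects / rate_ips


if __name__ == "__main__":
    network_size = 2_500_000  # assumed: the maximum network consists entirely of interconnects
    for mode, rate in RATES_IPS.items():
        print(f"{mode}: {seconds_per_update(network_size, rate):.2f} s per pass")
```

On these figures a maximum-size network takes a quarter of a second per pass at the peak rate and well over a second in learning mode, so "real-time" performance depends heavily on network size.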

The board comes with Release 2.0 of 
the company’s Neurosoft software,
which reportedly enables users to treat 
neural networks as subroutines within C 
programs. The software includes the 
User Interface Subroutine Library and 
five neural net packages. All neurocom¬ 
puting functions are executed within the 
on-board memory of the Anza-Plus, 
leaving the host free for I/O functions 
and preprocessing. 

The coprocessor board comes in two 
versions, one with 10M bytes of mem¬ 
ory and one with 2M bytes of memory. 
The 2M-byte model is limited to imple¬ 
menting neural networks with a com¬ 
bined total of 500,000 processing 
elements and interconnects. 

The 10M-byte model costs $14,900. It
also comes bundled in a Zenith Model
248 PC for $19,900 or in a Zenith
386-80 PC for $24,900. The 2M-byte
model costs $3900 less in all of these
configurations.

The Zenith 248 host PC comes with a 
10-MHz 80286 processor, 40M-byte 
hard disk, 1M byte of extended mem¬ 
ory, and an 80287 coprocessor. The 
Zenith 386-80 host PC comes with a 
16-MHz 80386 processor, 1M byte of 
RAM, and an 80M-byte hard disk with 
a 25-ms access time. Both host PCs 
include a 1.2M-byte floppy and an EGA
color adapter with 13-inch color monitor. 

The Anza-Plus is based on the Weitek 
Accel family of processors, which
incorporates reduced instruction set
computer technology. Processor fea¬ 
tures include a 32-bit floating-point 
processor capable of 20 million 
floating-point operations per second. 
The board has a peak power dissipation 
of 15 watts. 

Hecht-Nielsen Neurocomputers has 
reduced the price of the original Anza 
coprocessor to $6900, including Neu¬ 
rosoft software. 

Anza-Plus: Reader Service 31 

The Anza-Plus from Hecht-Nielsen Neurocomputers is a single-board coprocessor
for implementing neural networks on an IBM PC-AT or 386 compatible.

Nestor Development System emulates human learning


Nestor, Inc. has announced the Nes¬ 
tor Development System, a neural- 
network, artificial-intelligence-based 
product which the company says emu¬ 
lates aspects of the human thinking pro¬ 
cess and learns automatically from its 
own experience. The system reportedly 
eliminates the task of knowledge 
engineering and rule programming on 
which AI expert systems are based. 

The company claims that NDS sim¬ 
plifies the creation of neural architec¬ 
tures, and that its performance in 
learning and recognizing new patterns is 
independent of the number of patterns 
previously learned. The software fea¬ 
tures the proprietary Nestor Learning 
System (NLS) multimodule neural net¬ 
work architecture, recognition packages 
for a variety of pattern recognition 
applications, diagnostics for system 
configuration and development of new 
coding methods, examples of complete 
recognition packages, and documenta¬ 
tion on how to apply the NLS recogni¬ 
tion technology. 

Nestor describes the software’s com¬ 
ponents as tools to enable you to train 
new NLS memories, tabulate recogni¬ 
tion performance statistics, determine 
which training patterns are leading to 
erroneous results, and evaluate the need 
for revising your feature-extraction 
methods. Using NDS reportedly involves 
seven steps: 

(1) Selecting the recognition problem 
and collecting sample data 

(2) Developing feature-extraction and 
encoding methods 

(3) Defining the NLS system archi¬ 
tecture 

(4) Training and testing the NLS 
recognizer with sample data 

(5) Evaluating recognition results 


(6) Revising the feature-extraction 
methods and system architecture to 
improve recognition results 

(7) Merging the recognition package 
with an application program 
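
The seven steps above describe a generic train, evaluate, and revise loop. The sketch below illustrates steps 1, 2, 4, and 5 with a toy nearest-prototype recognizer. It is a stand-in chosen for brevity and is not the Nestor Learning System; every function name and data value here is hypothetical.

```python
# Toy train/evaluate loop for a pattern recognizer (not the NLS algorithm).
from collections import defaultdict


def extract_features(sample):
    # Step 2 (hypothetical encoding): use the raw measurements as the feature vector.
    return tuple(float(x) for x in sample)


def train(samples):
    # Step 4: average the feature vectors of each class into one prototype.
    sums = {}
    counts = defaultdict(int)
    for raw, label in samples:
        feats = extract_features(raw)
        if label not in sums:
            sums[label] = [0.0] * len(feats)
        sums[label] = [s + f for s, f in zip(sums[label], feats)]
        counts[label] += 1
    return {label: tuple(s / counts[label] for s in total) for label, total in sums.items()}


def recognize(prototypes, raw):
    # Classify by the closest prototype (squared Euclidean distance).
    feats = extract_features(raw)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(prototypes[label], feats))

    return min(prototypes, key=distance)


def evaluate(prototypes, samples):
    # Step 5: tabulate accuracy and list the patterns that were misrecognized.
    errors = [(raw, label) for raw, label in samples if recognize(prototypes, raw) != label]
    return 1 - len(errors) / len(samples), errors


if __name__ == "__main__":
    # Step 1: hypothetical sample data for two pattern classes.
    training = [((0.0, 0.1), "A"), ((0.2, 0.0), "A"), ((1.0, 0.9), "B"), ((0.8, 1.0), "B")]
    testing = [((0.1, 0.1), "A"), ((0.9, 0.9), "B")]
    model = train(training)
    accuracy, errors = evaluate(model, testing)
    print(accuracy, errors)
```

The error list returned by `evaluate` plays the role of step 6: it shows which patterns to inspect before revising the feature extraction.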

NDS code is written in the C lan¬ 
guage. Portions of the training program 
and code explained by the tutorial are 
supplied as source code. 

The system features 15 million con¬ 
nections, 150,000 processing elements, 
500,000 connections per second, and 
dynamically allocated connectivity, 
according to the company. 

The NLS-callable routines require 
50K bytes of memory. NLS memory 
requirements are application dependent. 

NDS runs on Sun or Apollo workstations
or the IBM PC-AT. On the PC-AT, it requires a
1.2M-byte disk drive, 512K bytes of
memory, MS-DOS 3.0 or higher,
Microsoft C version 4.0 or higher, and
a 10M-byte hard drive.

The total cost of NDS on the Sun and 
Apollo workstations is $25,000. This 
breaks down into $9000 for a two-week 
training period and $16,000 for the first 
copy of the software. Additional copies 
for inclusion in applications cost $10,000 
or less, depending on quantity. The 
training concentrates on background 
information, followed by development 
of the customer’s specific application. 
The IBM PC-AT version costs $8,500 
for the software, for a total of $17,500 
with training. Additional copies cost 
$5000 or less, depending on quantity. 

NDS was developed by Leon Cooper 
and Charles Elbaum, with further 
development proceeding in cooperation 
with Douglas Reilly and Christopher 
Scofield of Nestor. 

Concurrent 3280SP enters superminicomputer market


Concurrent Computer Corp. offers 
the Model 3280SP uniprocessor super¬ 
minicomputer. The system comes in a 
71-inch cabinet with up to 1G byte of 
disk storage, up to 32M bytes of memory,
and up to 20M bytes per second of
data throughput, according to the 
company. 

A base system features disk and mag¬ 
netic tape subsystems with a 4M-byte 
memory capacity for $199,500. The 
company claims a processing perfor¬ 
mance for the Model 3280SP of 6.4 mil¬ 
lion single-precision Whetstone 
instructions per second. 


The 3280SP operates on Concurrent’s 
proprietary operating system, OS/32, 
and the Xelos operating system, a port 
of Unix System V. 

Other features include a 32-bit 
processor with 16K-byte cache memory, 
64-bit floating-point processor, and 8K- 
bit words of writable control store; a 
64M-byte-per-second memory bus; an 
HPD-368F disk; and a 10M-byte-per-second
direct memory interface,
expandable to two interfaces with an 
aggregate of 20M bytes per second. 

Prime offers MXCL 5

Prime Computer has introduced the 
MXCL 5 departmental supercomputer 
for numerically intensive computing. 

The new minisupercomputer comes 
from a partnership with Cydrome Inc., 
which designed the system’s Directed 
Dataflow architecture. 

The MXCL 5 reportedly provides an 
interactive environment based on 
AT&T’s Unix System V.3 and incor¬ 
porates a number of standards: IEEE 
754 floating-point format, VME bus, 
Fortran 77, and IEEE 802.3 Ethernet 
and TCP/IP protocols. It operates as a 
dedicated floating-point computation 
engine, or integrated into a variety of 
networked environments as a computa¬ 
tion server for workstations, super¬ 
minicomputers, or mainframe systems, 
according to the company. 

Features include a 100K, ECL-based, 
64-bit numeric processor with a 40-ns 
clock speed; up to six 32-bit interactive 
processors with a 16.67-MHz clock 
speed, for noncomputational functions; 
a 100M-byte-per-second system bus; an
enhanced Fortran 77 compiler; up to 
256M bytes of very high bandwidth 
main memory; up to 64M bytes of sup¬ 
port memory; an aggregate 400M-byte- 
per-second bandwidth for disk access; 
and I/O performed independently of 
the numeric processor. 

According to Prime, the MXCL 5 
was designed with eight functional units 
working in parallel to balance the com¬ 
puter’s processing performance. Each 
functional unit is specialized to execute 
a specific type of instruction, such as 
floating-point addition, memory access, 
and address arithmetic. 

The standard MXCL 5 system 
includes a numeric processor, an inter¬ 
active processor, a service processor 
and personal computer console, an I/O 
processor, a VME bus, an I/O central 
cabinet, an I/O expansion cabinet, a 
disk drive and controller, a tape drive 
and controller, the MX/IX Unix oper¬ 
ating system licensed for an unlimited 
number of users, a program debugger, a 
Korn shell, and libraries.

The two base configurations available 
differ only in the amount of memory 
offered. The entry-level MX1201 model 
with 8M bytes each of main and sup¬ 
port memory and associated peripherals 
costs $579,000. The MX1401 with 64M 
bytes of main memory and 16M bytes 
of support memory and associated 
peripherals costs $774,000. 

System installation for the MX1201 
and MX1401 costs $5000 and $7000, 
respectively. 

IBM Controller speeds communications


IBM has announced the 3745 com¬ 
munication controller, featuring what 
the company calls its newest and densest 
logic chips. The controller reportedly 
allows users to channel-attach to a max¬ 
imum of 16 System/370 computers, 
plug in line-interface couplers without 
shutting off the power, and diagnose 
and repair components without disrupt¬ 
ing system operations. 

The IBM 3745 comes with the option 
of one or two internal central control 
units (Models 210 and 410, respec¬ 
tively). With two units, one can be shut 
down for upgrading or servicing while 
the other continues to function. Each 
CCU can be configured with 4M bytes 
or 8M bytes of memory. Cache memory 
provides 16K bytes of memory for each 
CCU. 

The controller incorporates three 
application-specific integrated circuitry 
chips from IBM’s new family of chips, 
each chip holding up to 40,000 circuits.
It consists of four subsystems: control,
communication, maintenance and oper¬ 
ator, and power. 

The IBM 3745 Model 210 includes 
one CCU, a 45M-byte hard disk, two 
low-speed communication scanners, 
eight line-interface couplers, 4M bytes 
of memory, Remote Support Facility 
modem, and line cables. 

Contact the company for more infor¬ 
mation about pricing and availability. 

Entry-level models join Unisys 1100/90 mainframes

Unisys has added two new models to 
its 1100/90 family of mainframe com¬ 
puters. The 1100/91 and 1100/92 
Model II SV processors reportedly fea¬ 
ture object-code compatibility with cur¬ 
rent 1100 Series and 2200/200 systems. 

The single-processor 1100/91 Model 
II SV uses 256K RAM chips and comes 
in 8M-byte or 16M-byte memory units. 
Including operator console and system 
control software, it costs $1,429,000. 

The dual-processor 1100/92 Model II 
SV costs $2,605,000. It features a maxi¬ 
mum system memory of 32M bytes. 

The IBM 3745 communication controller incorporates logic chips containing up to
40,000 circuits per chip. The Model 210 provides one CCU, while the Model 410 
provides two CCUs. 


Alliant’s new 2nd-generation minisupers based on ACE 


Alliant Computer Systems has 
announced the second-generation 
FX/40 and FX/80 minisupercomputers. 
The new systems are based on the com¬ 
pany’s Advanced Computational Ele¬ 
ment, a general-purpose, 64-bit vector 
processor. They replace the FX/8. 

The top-of-the-line FX/80 features 
up to eight vector processors, with a 
peak rating of 188.8 Mflops. A base 
configuration FX/80 costs $299,000. It 
includes two ACEs, two VME-based 
Interactive Processors (IPs), 32M bytes 
of main memory expandable to 256M 
bytes, a 16-line multiplexer, 1.1G bytes 
of disk storage, and a magnetic tape 
drive. Up to six ACEs and 10 IPs can be 
added. 

The FX/40 incorporates up to four 
processors, with a peak rating of 94.4 
Mflops. A base configuration FX/40 
costs $149,000. It includes one ACE, 
one VME-based IP, 32M bytes of main 
memory expandable to 160M bytes,
1.1G bytes of disk storage, and a cartridge
tape drive. Up to three ACEs and
five IPs can be added. 
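
The peak ratings and prices in this announcement can be cross-checked with a little arithmetic. The sketch below assumes that peak performance divides evenly across ACEs and that a fully populated system costs the base price plus the $59,000 per-ACE add-on price quoted in this announcement. It ignores extra Interactive Processors and memory.

```python
# Per-ACE peak rating and configuration price, from figures quoted in the announcement.
ACE_PRICE = 59_000  # add-on price per ACE

SYSTEMS = {
    "FX/80": {"max_aces": 8, "peak_mflops": 188.8, "base_price": 299_000, "base_aces": 2},
    "FX/40": {"max_aces": 4, "peak_mflops": 94.4, "base_price": 149_000, "base_aces": 1},
}


def peak_per_ace(model: str) -> float:
    """Peak Mflops per ACE, assuming the system rating scales linearly."""
    system = SYSTEMS[model]
    return system["peak_mflops"] / system["max_aces"]


def price_with_aces(model: str, aces: int) -> int:
    """Base price plus add-on ACEs; other options are not included."""
    system = SYSTEMS[model]
    if not system["base_aces"] <= aces <= system["max_aces"]:
        raise ValueError("ACE count outside the supported range")
    return system["base_price"] + (aces - system["base_aces"]) * ACE_PRICE


if __name__ == "__main__":
    for name, system in SYSTEMS.items():
        full = price_with_aces(name, system["max_aces"])
        print(f"{name}: {peak_per_ace(name):.1f} Mflops per ACE, ${full:,} fully populated")
```

Both models work out to the same 23.6 Mflops per ACE, which suggests the two systems use the same processor and differ mainly in how many processors they hold.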

All systems include an Ethernet con¬ 
troller, VME I/O chassis, console video 
terminal, dot matrix printer, and oper¬ 
ating system software along with 
FX/Fortran and the Alliant Scientific 
Libraries. 

The new systems reportedly offer up 
to twice the performance of Alliant’s
earlier FX/Series systems, yet are compatible
with those machines and the
available applications software pack¬ 
ages written for them. The new ACEs 
are plug-compatible replacements for 
the Computational Element boards in 
the first-generation FX/8. 

Users can add ACEs, priced at 
$59,000 each, to increase system perfor¬ 
mance and capacity without recompil¬ 
ing or relinking, according to the 
company. 

Alliant offers VAX and workstation 
integration utilities that run on integral 
and compatible Interactive Processors, 
which frees the parallel vector proces¬ 
sors for complex computations. VAX 
integration products include DECnet 
networking, Digital Command Lan¬ 
guage emulation, EDT editor, and the 
X Window System. Workstation inte¬ 
gration software includes NFS, NCS, 
and NeWS. Data communications 
products include TCP/IP support for 
Ethernet and support for the Hyper¬ 
channel. Optional languages include 
FX/C, FX/Ada, and Pascal. 

Alliant has also released previously 
proprietary specifications on its instruc¬ 
tion set for parallel execution. The com¬ 
pany said that it wants to facilitate the 
development of parallel processing 
applications. 

NCR adds four models to Tower family, offers new LAN family



The NCR Tower Family of Unix-based supermicrocomputers now includes (left to 
right) the entry-level 32/200, midrange 32/450 and 32/650, and high-end 32/850. 


NCR has announced the expansion of 
the NCR Tower family of Unix-based 
supermicrocomputers with four new 
models. 

The NCR Tower 32/850 becomes the 
most powerful member of the family, 
handling up to 512 users, yet remains 
compatible with the 32/800. The new 
system features support for up to six 
25-MHz 68020-based application 
processors, each with 8M bytes of sys¬ 
tem memory expandable to 16M bytes; 
a new file processor; up to 64M bytes of 
system memory available for each AP; 
40K-byte cache memory and a 68882 
floating-point coprocessor on each AP; 
and an integrated disk capacity of 1.9G 
bytes and an external disk capacity of 
20G bytes. The 32/850 will cost
$106,175, with shipments expected in 
the third quarter of 1988. 

The NCR Tower 32/650 and 32/450 
are new midrange models. The 32/450
can support 32 users, while the 32/650 
can support 64 users. Both models fea¬ 
ture a 25-MHz 68020-based processor 
with 32K bytes of cache memory. A 
mass storage controller, storage devices, 
languages, tools, and communication 
facilities are available for the new 
midrange models. Prices will be $15,565 
for the 32/450 and $24,915 for the 
32/650. Volume deliveries are scheduled 
for June 1988. 

The NCR Tower 32/200 is an entry- 
level system. It supports up to four ter¬ 
minals, networked PCs, and a parallel 
printer. The 32/200 has a Unix 
5.3-based operating system with stan¬ 
dard communications and networking 
protocols, according to the company. 
The system will come in two configura¬ 
tions: a flex disk-based system and a 
tape-based system. Available in April, 
the 32/200 will cost $5445. 

NCR has also announced an NCR 
Tower local area network family. The 
NCR Tower LAN Family includes the
NCR Tower File Server, NCR Token-Ring
Controller, NCR Token-Ring Netbios,
and NCR Transmission Control
Protocol/Internet Protocol via
Ethernet.

The file server reputedly provides a 
transparent file system between DOS 
and Unix. It is compatible with 
Expanded Towernet and NCR Token- 
Ring LANs. Available in the third quar¬ 
ter of 1988, the file server will cost 
$1230. 

The token-ring controller connects 
the NCR Tower to a token-ring LAN.
Available in the second quarter, it will
cost $2195. 

The token-ring Netbios provides an 
interface between the NCR Tower File 
Server and the token-ring LAN. It will 
be available in the second quarter for 
$530. 

NCR TCP/IP connects NCR Tower 
computers to other vendors’ computer 
systems. It will be available in the sec¬ 
ond quarter. 

Adobe offers new production tools


Adobe Systems has announced 
Adobe Illustrator 88, a Postscript-based 
product that incorporates tools of the 
original Adobe Illustrator plus color, a 
blending tool, resolution-independent 
pattern fills, and masking. The package 
also includes a freehand drawing tool. 

According to the company, Illustra¬ 
tor 88 lets illustrators see color images 
on-screen, create color separations for
output to a Postscript typesetter, and
print on a color thermal transfer 
printer. The blending tool permits high¬ 
lighting, contouring, airbrushing, and 
shading. 

Shipments of Adobe Illustrator 88 
are scheduled for May. The software 
package will cost $495. 

Adobe also offers the Adobe Illustrator
Collector’s Edition, a package of
pre-built Adobe Illustrator basic
graphic shapes. It includes 100 borders 
and more than 300 dingbats, plus a 
medium-weight serif and sans serif typeface
that can be edited and modified to
create logotypes. Available in April, the 
package will cost $125. 
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1C Announcements 


Company, Model, Function Comments R.S. No. 


Gould Inc. 
GC Series 
Gate arrays 


Integrated Device 
Technology 
IDT49C460B 
EDC device 


A 1.25-micrometer series of 9 base arrays, part of Gould’s ASIC Continuum line. Typical 120 
gate delay of 0.5 ns with maximum toggle frequency of 100 MHz for a D-type flip-flop. 

Offers 1000 to 14,000 usable gates from sea of gates-based arrays of 3000 to 40,000 gates. 
Packaging includes plastic, ceramic, and TAB with 8 to 280 leads. No prices given. 

A 32-bit error detection and correction device with maximum error detection times of 25 ns 121 
and error correction of 30 ns. A functional replacement for two Am2960s. Cascadable to 
64 bits. Generates check bits on a 32-bit data field according to a modified Hamming code. 
Comes in six 68-pin package options, commercial and military. Cost: $87.80 (100s) for 
shrink DIP. 


Integrated Device 
Technology 
IDT721264L30, 
IDT721265L30 
Floating-point chip set 

LSI Logic 

LDD10000 Direct Drive 

Series 

Gate arrays 


A floating-point chip set: multiplier and arithmetic logic unit. Performs 32-bit operations 122 
at 33.4 Mflops and 64-bit operations at 25 Mflops. Pin compatible with the Weitek 1264 
and 1265. Features 30-ns clock speeds. Conforms to IEEE STD 754 version 10.0 and has 
latency time of 120 ns. Packaged in a 144-pin PGA. Cost: $310 per chip (100s). 


A family of arrays based on Channel-Free architecture. Six masterslices offered, permitting 123 
designs of 8000 to 45,000 usable gates. The company is accepting initial design orders, with 
customer prototype shipments planned for the second half of 1988. Sample devices will be 
available in the second quarter. 


National Semiconductor Nine gate arrays featuring 0.6-ns typical gate delays and ranging from 400 to 15,000 124 

SCX6B00 Series equivalent two-input NAND gates and 28 pads to 200 pads. Offered in through-hole and 

Gate arrays surface-mount packages ranging from 8 to 196 pins. Operate over military temperature 

range. Cost: development charges start at $12,000. 


National Semiconductor A programmable processor for bitmapped graphics systems. Part of the Advanced 125 

DP8500 RGP Graphics Chip Set. A 20-MHz chip that features a 100-ns bus cycle time on back-to-back 

Raster graphics processor vector and block operations. Samples available now; production quantities in the third 
quarter of 1988. Cost: $95 (10,000s). 


NCR 
85C20 
ECC chip 


An error-correcting code chip for optical disk storage devices. Implements a Reed- 126 

Solomon error-correcting code. Licensed from Data Systems Technology, which can sup¬ 
ply supporting software. Supports data transfer rates of up to 24M bps. Programmable for 
three to 10 interleaves. Cost: $23.10 (1000s). 


Signetics Four 20-pin, EPROM-based programmable logic devices designed to replace Series 20 127 

PLC16V8 Series Bipolar PAL ICs. Can be configured to emulate 22 PAL devices in various configurations. 

PLD family A two-level logic element with 10 inputs, 72 programmable AND terms, and 8 output 

macro cells. Cost: in 100s, $5-$9.46 for quarter-power PLC16V8, $2.67-$5.08 for half¬ 
power PLC16V8. 


Standard Microsystems LAN-068 contains the COM9026 LAN Controller, COM90C32 LAN Transceiver, and 128 

LAN-068, LAN-058 HYC9068 LAN Driver. LAN-058 includes the COM9026 LANC, COM90C32 LANT, and 

Arcnet chip sets HYC9058 High Impedance Transceiver. Components are available in plastic. Cost: in 

1000s, $39.60 for LAN-068, $48.05 for LAN-058. 


Texas Instruments
TMS27PC64, TMS27PC128, TMS27PC256, TMS27PC512
PROMs

Four plastic-packaged one-time programmable PROMs. Available in four densities: 64K 129
(TMS27PC64), 128K (TMS27PC128), 256K (TMS27PC256), and 512K (TMS27PC512).
Device speeds range from 150 ns for TMS27PC64 to 200 ns for the others. Cost: in 100s,
$3.16 for TMS27PC64NL, $3.63 for TMS27PC128NL, $4.34 for TMS27PC256NL, $6.16
for TMS27PC512NL.


Toshiba America
TC511000Z, TC514256Z
DRAMs

A family of 1M-bit DRAMs in ZIP (zig-zag in-line package) organized as 1M-bit x 1 130
(TC511000Z) and 256K x 4 (TC514256Z) with multiplexed address inputs. Available with
access times of 84 ns, 100 ns, and 120 ns. Cost: in 1000s, $19.25 for TC511000Z-12, $19.90
for TC514256-12.


Video Seven 
V7VGA 
VGA chip 


A video graphics array chip with video RAM and VGA hardware compatibility. Supports 131
monochrome and color graphics and text. Operates at dot clock rates up to 65 MHz.
Comes with 256K- or 512K-byte VRAM and 256K-byte DRAM. Each memory configuration
supports an increasing number of colors with varying resolutions. No prices given.
Data Services and Servers
Information System Architecture 
Performance Evaluation 
User Interfaces 


David Du, University of Minnesota 

Publicity: 

James P. Held, University of Minnesota 


Treasurer: 

Douglas K. Barry, Control Data Corp. 

Local Arrangements: 

Hamideh Afsarmanesh, Calif. State Univ. D.H.


Industrial Coordinator 

Elmer Baldwin, Oracle Corp. 


COMMITTEE MEMBERS (TENTATIVE)



PAPER SUBMISSIONS 

Each paper's length should be limited to 8 proceedings pages, which is about 5000 words, or
25 double-spaced typed pages. Five copies of completed papers should be mailed before June
15, 1988 to:

Richard L. Shuey, Computer Science Department, Rensselaer Polytechnic Institute, Troy,
NY, 12189-3590; (518) 276-8376 or (518) 374-5684; Shuey%MTS@ITSGW.RPI.EDU
or shuey@ge-crd.arpa


TUTORIALS 

The day preceding the conference will be devoted to introductory tutorials which may provide 
background for the conference proper. The day following the conference will be devoted to 
advanced tutorials. Proposals for tutorials on Data Engineering topics are welcome. Send
proposals by June 15, 1988 to:

Mohan Ahuja, Department of Computer Science, The Ohio State University, 2036 Neil
Avenue, Columbus, OH 43210-1277; (614) 292-6377; shuje@ohio-state.arpa


CONFERENCE TIMETABLE AND INFORMATION 


Papers due:               June 15, 1988
Tutorial Proposals Due:   June 15, 1988
Acceptance Letter Sent:   September 15, 1988
Camera Ready Copy Due:    November 1, 1988
Tutorials:                February 6 and 10, 1989
Conference:               February 7-9, 1989


For further information contact the General Chairperson, John Carlis, Computer Science
Department, University of Minnesota, 207 Church Street SE, Minneapolis, MN 55455;
(612) 625-6092; carlis%umn-cs.arpa@relay.cs.net


AWARDS, STUDENT PAPERS AND SUBSEQUENT 
PUBLICATION: 

Awards will be given to the best paper and to the best student paper (denoted as such when
submitted solely by students). The latter will receive the K. S. Fu award honoring one of the
early supporters of the conference. Up to three grants of $500.00 each will be available to
help defray travel costs of student authors. Outstanding papers will be considered for
publication in the IEEE Computer Society publications: Computer, Expert, Software, and
Transactions on Software Engineering, etc. For more information, contact the general chairman.
Microsystem Announcements


Company, Model, Function Comments R.S. No. 


Analog Devices 

RTI-980 

Array processor 

An array processor for Multibus II. Provides numeric processing up to 8 MIPS with 16-bit 135
fixed-point format. Uses the ADSP-2100 DSP microprocessor to provide ALU, MAC, and
shift register functions. An 80188 coprocessor manages I/O, DMA, and memory
operations. Software development tools and target environment tools available. Cost: $5995.

Applied Data Sciences 
PCHSD 

Emulator 

Emulates a Gould HSDII card on an IBM PC-AT or compatible. Features an on-board 136
FIFO, burst transfer rates of 6.6M bytes per second to or from the high-speed data
interface, bidirectional transfers, and software I/O driver routines. Plugs into a dual
connector slot in the PC. Cost: $4675.

Columbia University 
Kermit v. 2.30 

Protocol software 

An error-correcting protocol for transferring sequential files between computers of all sizes 137 
over asynchronous telecommunications lines. Not in the public domain, but not licensed or 
copy protected. Not available for commercial use. Version 2.30 runs on the IBM PC 
family, including PS/2 series, and compatibles. Cost: a distribution fee. 

Flexus International 
CobolspII, v. 1.0 

Cobol screen manager 

A development tool to support prototyping and user interface management for PC Cobol 138
applications. Allows users to create models without writing Cobol source code, then
convert models into Cobol programs. Includes a panel painter, panel file manager,
prototyping tool, runtime unit, and source code generator. Cost: $395.

Floating Point Systems 
MP32 SSP 

Array processor 

A 32-bit array processor with up to four computational processors and a peak speed of 72 139
Mflops. Available with Digital Equipment, Data General, and Concurrent interfaces,
including Qbus, Unibus, VAXBI bus, Burst Multiplex Channel, and BSELCH. Cost:
$57,500.

IDR Inc. 

IDR Computer Module 
Single-board computer 

A 386-based, IBM PC-AT compatible, single-board computer module for OEMs and 140
VARs. Features 2M bytes of zero-wait-state memory, enhanced graphics display adapter,
hard disk controller, floppy disk controller, two serial communications ports, one parallel
printer port, and a real-time clock. No price given.

Macrotech International 
MI386S 

Satellite board 

A 16-MHz 80386 satellite board for the S100 bus. Up to seven satellite boards possible in a 141
system (more with a custom PAL). Features 1M byte of 32-bit dynamic memory and 1M
byte of dual-ported, 32-bit, zero-wait-state, four-way interleaved, 100-ns dynamic RAM.
Optional 80387 math coprocessor. Cost: $2667.

Motorola 

MVME373 

MAP controller board 

An MC68000-based, VMEbus, Mini-MAP controller board, compatible with MAP 142
physical connection requirements. Hosts an MC68824 Enhanced Token Bus Controller chip
and Carrier-band RF Modem chip. Supports either the Mini-MAP three-layer stack or
seven-layer MAP end-node protocol stack. Scheduled for fourth quarter. Cost: $995.

Newer Technology 
DartCard 

Memory board 

A modular RAM disk memory expansion system configurable with a variety of disk 143
interfaces. Consists of a half- or full-height expansion card rack; plug-in DartCard memory
boards of 4M, 16M, or 64M bytes; and choice of personality interfaces, including AT-bus
multifunction, SCSI, and ESDI. No prices given.

Radius Inc. 

Accelerator 25 

Processor card 

A 25-MHz processor card for the Apple Macintosh SE. Based on the 32-bit Motorola 144
68020. Features high-speed logic with a write-through hardware cache of 32K bytes of
static RAM. Works with Apple’s single in-line memory modules. Optional floating-point
coprocessor, the Motorola 68881. Cost: around $2000.

Scientific Micro Systems 
SMS 1000 Model 38 
Microcomputer 

A microcomputer based on Digital Equipment’s Q-bus architecture and compatible with 145
the DEC LSI-11 computer. Includes a Winchester hard disk drive, an LSI-11/23 or
LSI-11/73 CPU, four or eight serial ports, removable media, and a five-slot quad-height
Q-bus backplane. Cost: $3000 to $17,000 based on configuration and quantity.

Waterworks Software 
RAM Lord 

RAM manager 

A RAM management utility. Eliminates software conflicts and memory overload by having 146 
a PC swap TSR/RAM-resident packages in and out of expanded or extended memory or 
from hard disk or floppy disk space. Also eliminates duplicate hot keys. Cost: $99.95. 
Engineering of Computer-Based Medical Systems
June 8-10, 1988 • Minneapolis, Minnesota


Engineering in Medicine and Biology
Society of the IEEE

The Computer Society of the IEEE

The symposium will explore the full range of issues and
problems that emerge from the process of engineering 
computer-based medical systems. Sessions will be held in the 
following areas: 

• hardware architecture and applications 

• reliability, fault tolerance, and design for testability 

• software quality assurance and validity 

• artificial intelligence and expert systems 

• regulatory concerns and strategies 

• chronobiology and closed-loop drug delivery systems 

• software applications 

In addition there will be four pre-conference tutorials: 

• Medical Computing: The Big Picture

• Hardware Design for Reliability 

• Software Safety 

• Regulatory Issues and Standards 

A WorldMed '88 Event

This symposium is part of WorldMed '88, the second in a biannual
series of major conferences in Minnesota focusing on
medical technology and the healthcare industry. Meeting
concurrently with the IEEE symposium in the Minneapolis Hyatt
Regency are conferences on (1) biomedical technology transfer
and (2) competitive business strategies in the healthcare
industry. Registration for one component of WorldMed '88
provides entrance to all three, and to a general floor.

WorldMed Keynote Speakers

• John Villforth, Director, Center for Medical Devices and Radiological
Health, FDA

“Medical Devices and Regulatory Issues”

• Wilbur Gantz, President, Baxter Healthcare Corp.

“How Can Technology Meet the Needs
of the Healthcare Market?”

• Werner Maly, Generalbevollmächtigter Direktor, Siemens Medical
Engineering Group, Erlangen, West Germany

“Strategic Planning in the World Market for Medical Products”


Symposium Committees 

EXECUTIVE COMMITTEE

Dr. John Long
Department of Surgery
University of Minnesota
2829 University Avenue S.E.
Minneapolis, MN 55414
(612) 627-4850

Dr. Troy Nagle
North Carolina State University
Raleigh, NC
(Representing IEEE CS)

Dr. Al Potvin
Eli Lilly & Company
Indianapolis, IN
(Representing IEEE EMBS)


PROGRAM CHAIRMEN

Dr. Tim Kriewall
Sarns Inc./3M
Ann Arbor, MI

Dr. Max Cortner
Unisys
St. Paul, MN

Mr. Greg Sachs
Biomedicus
Eden Prairie, MN


COOPERATING ORGANIZATIONS

IEEE Healthcare Engineering
Policy Committee
Region 7 IEEE
Twin Cities Chapters of CS, EMBS
IEEE CS Technical Committee on
Computational Medicine


For More Information
or to Register

Call or write the

Office of Continuing Medical Education
University of Minnesota
Box 202 UMHC
420 Delaware Street S.E.
Minneapolis, MN 55455
Telephone (612) 626-5525
The Computer Society strives to advance the theory and practice
of computer science and engineering. It promotes the exchange of 
technical information among its 90,000 members around the world, 
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members and nonmembers. 

Membership 

Members receive the highly acclaimed monthly magazine Com¬ 
Innovator of timed Petri nets keynotes international workshop


C.M. Woodside, Carleton University 

Carl Adam Petri, whose doctoral thesis 25 years ago proposed the model
that now bears his name, delivered the keynote address at the
International Workshop on Petri Nets and Performance Models. His talk,
“Tools of General Net Theory,” delved into the mathematics and the
potential applications of ordinary Petri net models, providing a
background for later discussions involving nets with time.

Timed Petri nets are emerging from current research as one of the basic
approaches for predicting time delay and throughput performance in
complex systems, including computer and communications systems. Ninety
researchers from nine countries attended the 1987 workshop at the
University of Wisconsin/Madison to discuss recent work in the area,
including applications, ways to make the approach more powerful, and
ways to combine it with better-known performance analysis methods like
simulation and queueing analysis. It was the second meeting in a
series that began in Turin, Italy, in 1985.

Petri’s address dealt with three main 
topics: net operations, net morphisms, 
and net topologies. 

The discussion on net operations first defined a fundamental operation
called simplification that produces so-called “simple nets.” Other
participants presented further operations on these simple nets and
demonstrated them to be useful in producing nets with desirable
properties.

In his remarks on the concept of net morphisms, Petri characterized a
net morphism as a mapping of a net to a net, which, as a result,
preserves some important relations among the net components.

In his discussion on net topologies, Petri advocated an operational
(that is, computerizable) approach to topology that produces a theory
of concurrency and provides a new way to generate combinatorial models
from continuous models.

Applications evolved as a major topic of presentations and discussion
throughout the workshop. The applications considered included

• communications protocols for local area networks,
• software performance, including parallel tasks with rendezvous,
• distributed file systems,
• clock synchronization,
• VLSI performance,
• multiprocessors and interconnection nets for them, and
• flexible manufacturing systems.

The applications used timed net models to find expected delay in
complex sequences of actions; average throughput capacities of parallel
computers, communications channels, or manufacturing systems; and mean
times to failure, or average failure rates, in fault-tolerant designs.
In some cases, the net models were also used for correctness analysis.

The practical use of Petri nets almost always requires a computational
tool, and it was generally agreed that the present state of the tools
limits the applications to small systems and analysis by experts. All
the applications described related to small systems, although most of
them were real (rather than paper systems) and dealt with real
engineering design issues. The systems must be small because the tools
require large amounts of space and time, both increasing at exponential
rates with the number of elements in the system.
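
The exponential growth mentioned above can be seen even in a toy model. The following is a minimal sketch, not any of the tools shown at the workshop: it enumerates the reachable markings of an ordinary (untimed) net by breadth-first search. The test net, in which n independent processes each alternate between an "idle" and a "busy" place, and all function names are assumptions made purely for illustration.

```python
from collections import deque


def reachable_markings(initial, transitions):
    """Enumerate every marking reachable from `initial` by breadth-first search.

    `initial` is a tuple of token counts, one entry per place.
    `transitions` is a list of (consume, produce) pairs of per-place counts.
    """
    seen = {initial}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for consume, produce in transitions:
            # A transition is enabled when each input place holds enough tokens.
            if all(m >= c for m, c in zip(marking, consume)):
                nxt = tuple(m - c + p for m, c, p in zip(marking, consume, produce))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


def independent_processes(n):
    """Hypothetical test net: n processes, each cycling idle -> busy -> idle."""
    places = 2 * n
    # Place 2k is "idle" for process k, place 2k+1 is "busy"; all start idle.
    initial = tuple(1 if i % 2 == 0 else 0 for i in range(places))
    transitions = []
    for k in range(n):
        idle = [0] * places
        busy = [0] * places
        idle[2 * k] = 1
        busy[2 * k + 1] = 1
        transitions.append((tuple(idle), tuple(busy)))  # start work
        transitions.append((tuple(busy), tuple(idle)))  # finish work
    return initial, transitions


def state_counts(max_n):
    """Number of reachable markings for 1..max_n independent processes."""
    return [len(reachable_markings(*independent_processes(n)))
            for n in range(1, max_n + 1)]


if __name__ == "__main__":
    print(state_counts(8))
```

Each added process doubles the number of markings in this net, so a tool that stores the full reachability set needs memory that grows as 2^n even though the net itself grows only linearly.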

Although the models are easy to construct, expertise is required to
find errors made during modeling. During the discussion, it was stated
that the industry is looking forward to the development of methods for
debugging net models. Despite these difficulties, several large
companies in Europe and North America have begun to develop the
expertise to use existing net tools and are using them in design
calculations.

The search for ways to do performance calculations on larger systems,
with large numbers of interacting processes or processors, led to
discussion of ways to exploit symmetries in some systems as well as to
approximate solutions. Based on various decompositions of systems,
approximations were described as modular-hierarchical, by time scale
of events, and by functional layering. Several groups are concentrating
in this direction. However, the researchers admittedly have little
experience as yet with the accuracy or robustness of their
approximations.

Another direction of current work is how to best exploit structural
analysis of the net, using the extensive mathematical theory developed
for ordinary (untimed) Petri nets, to get performance information.
With this approach, workshop participants showed that it is possible
to determine the existence of bottlenecks (which impose processing
throughput limits).
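
As one illustration of how structure alone can expose a bottleneck, consider a restricted case that the report does not spell out: a net with deterministic firing times in which every place has exactly one input and one output transition (a marked graph). For that class, a standard result is that the steady-state cycle time equals the largest ratio of total delay to token count over the directed cycles of the net. The sketch below assumes the cycles have already been listed by hand; the cycle names and numbers are hypothetical.

```python
from fractions import Fraction


def bottleneck(cycles):
    """Return (name, cycle_time) of the slowest cycle.

    `cycles` maps a cycle name to (total_delay, tokens), where total_delay
    is the integer sum of firing times around the cycle and tokens is the
    number of tokens circulating in it (constant in a marked graph).
    """
    worst_name, worst_time = None, Fraction(0)
    for name, (total_delay, tokens) in cycles.items():
        if tokens <= 0:
            raise ValueError(f"cycle {name!r} has no tokens and can never fire")
        cycle_time = Fraction(total_delay, tokens)
        if cycle_time > worst_time:
            worst_name, worst_time = name, cycle_time
    return worst_name, worst_time


def max_throughput(cycles):
    """Firings per time unit that the slowest cycle allows."""
    _, cycle_time = bottleneck(cycles)
    return 1 / cycle_time


if __name__ == "__main__":
    # Hypothetical system: three resource loops with hand-computed totals.
    example = {"cpu": (6, 2), "bus": (5, 1), "disk": (12, 3)}
    print(bottleneck(example), max_throughput(example))
```

In the example the "bus" loop has the largest delay-to-token ratio, so it limits throughput regardless of how fast the other loops run. No reachability set is built, which is why this kind of analysis scales to nets that state enumeration cannot handle.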

Some computational tools were exhibited, including the “design”
graphical tool from Meta Software, a company that also helped fund the
workshop. Two user-defined tools created with the “design” software
were shown; one is from MIT and the other is from the University of
Wisconsin at Madison.

Some of the other tools featured in the discussions come from the
Politecnico di Torino, Duke University, the University of California
at Irvine, and the University of Michigan at Ann Arbor. Some tools have
extensive graphical capability, including hierarchical structuring;
some trace execution sequences; some do structural analysis as well as
timings; and several include simulators. There were wall-to-wall
acronyms (GTPN, GSPN, DTPN, XSPN, ESPN, etc.) for the different
flavors of net models implemented in the tools. However, the two main
variations use deterministic versus random times in describing events
and intervals, and provide the two main “cultures” of the field.

The proceedings of the workshop are available from the Computer Society
Press, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720-2578;
(800) CS-BOOKS. When ordering, specify catalog number 87TH0815-9.

The next workshop in the series will 
be held in Kyoto, Japan, in December 
1989. 
Copyright protection appeal, compiler demonstration highlight Micro 20

Gearold Johnson, Colorado State University 


A call for a more precise software and firmware copyright protection
definition, plus an on-line demonstration of a retargetable microcode
compiler, highlighted a four-day Colorado Springs, Colo., conference.

The event was the 20th annual Workshop on Microprogramming, jointly
sponsored by the Computer Society’s Technical Committee on
Microprogramming (TC Micro) and the ACM’s Special Interest Group on
Microprogramming (SIGMicro) in December. Gearold Johnson, outgoing
SIGMicro chair, was chair of the workshop.

Opting against having a keynote speaker, Micro 20’s organizers
scheduled a presentation by Dennis Karjala, a law professor at Arizona
State University who also has a PhD in electrical engineering from the
University of Illinois.

In his talk, entitled “Copyright, Computer Software, and The New
Protectionism,” he attacked the current wave of court cases concerning
software and firmware and urged the adoption of a much more specific
definition for copyright protection than most software developers are
now claiming through litigation.

The discussion following the presentation was so lively that it had to
be adjourned until later in the evening, when it continued for another
3½ hours.

Karjala’s paper has been published in Vol. 28, No. 1, of Jurimetrics:
Journal of Law, Science and Technology.

The retargetable microcode compiler demonstration repeated the
demonstration that had taken place during the “great microcode compiler
cook-off” competition part of the workshop.

Joe Linn of the Institute for Defense Analysis and the current TC Micro
chair had put together a family of microarchitectures that was
distributed as part of the workshop information packet. Two of the six
family members were proposed for the competition: one for students and
another for industrial-grade microcode compilers.

No student-developed compilers were submitted, but Lothar Nowak of the
University of Kiel in West Germany demonstrated the institute’s Mimola
retargetable microcode compiler on the Linn architecture. The entry was
declared the winner of the competition.

A retrospective on two decades of Microprogramming Workshops

Edmund L. Gallizzi

As with many areas of computer science, microprogramming has enjoyed
rapid and continuous growth in knowledge. Much of this growth has been
directly supported by the yearly Workshops on Microprogramming that
have been conducted over the past two decades.

The workshops draw a small, specialized core group of attendees, as do
many specialized workshops and conferences. The inspiration for many
important ideas and for research has grown out of the discussions and
arguments of small groups of participants at the various workshops.

The series was one of the first to be jointly sponsored by the Computer
Society and the Association for Computing Machinery through the
respective Technical Committee on Microprogramming (TC Micro) and the
Special Interest Group on Microprogramming (SIGMicro).

Gearold Johnson, the SIGMicro chair from 1983 to 1987 and the chair of
Micro 20, outlined the overall technical history of the workshop as
starting with an interest in the hardware of microprogrammed systems.

This interest was engendered at Micro 1 in 1968 and continued strong
until Micro 5 or 6 in the early 70s. It was the era of the Nanodata
QM1, a general-purpose micro and nano-programmable host, and the B1700
with its compiler optimized microprogram and writable control store.

Johnson’s outline described Micros 7 to 9 in the mid-70s as a time when
the early “pioneering” high-level microprogramming languages were
presented. Of this era, Scott Davidson of AT&T, the TC Micro chair in
1983-84, said one of the goals of the research presented at the
workshops was to promote the development of machine-independent,
general-purpose, high-level microprogramming languages and systems.

Some of the pioneering high-level languages included S* by Subrata
Dasgupta, now of the University of Southwestern Louisiana; EMPL by
David DeWitt of the University of Wisconsin; MPL by Richard Eckhouse,
now of MOCO, Inc.; STRUM by David Patterson of the University of
California/Berkeley; and SIMPL by C.V. Ramamoorthy, also from UC
Berkeley.

The machine-independent language development led to research in
microcode compaction that provides a machine-dependent, optimization
mechanism to generate microcode from the output of the
machine-independent, high-level language compilers.

The debate surrounding the Reduced Instruction Set Computer (RISC) and
the Complex Instruction Set Computer (CISC) was evident at the
workshops from Micro 14 to Micro 16 in the early 80s, according to
Johnson.

Part of the debate centered in the “Microcode Shoot-out,” which has
been a tradition since it was initiated several years ago by Will Tracz
of IBM, the current SIGMicro chair.

According to Tracz, who has been the Shoot-out moderator each year,
many of the participants “shot from the lip.” Patterson, who has made
important contributions to microprogramming, was part of a heated
Shoot-out over his new RISC design. Typically, RISC designs use no
microprogramming, while the CISC systems are extensively
microprogrammed.

The latest workshop topics have focused on the techniques of firmware
engineering, as indicated by the Micro 13 keynote address on software
engineering by William Riddle, now director of the Software Production
Consortium. These software/firmware engineering techniques have
succeeded in fostering the development of the production-quality
microprogramming systems that were demonstrated at the latest
workshop, Micro 20.

A particularly successful system that grew out of interaction at the
workshop series is the Micro-C firmware development system of Robert
Mueller of Colorado State University. According to Mueller, a paper
that Preston Gurd delivered at a Micro 16 session when Mueller was
chair sparked the concept for the system’s development.

It’s a production-quality microcode-development system based on the C
programming language. The system is composed of an assembler, a
stand-alone optimizer, an ANSI C compiler, a symbolic debugger, and a
simulator. The target host architecture is defined by a common
configuration language.

The current C language-based microprogramming systems have been very
successful. According to Davidson, two important reasons for this
success are YACC and LEX, the compiler-generation tools of the Unix
operating system. These tools have allowed the developers to focus on
the microprogramming issues.

Maurice V. Wilkes, the developer of microprogramming and the leader of
the group at the Mathematical Laboratory of Cambridge that built the
first operational stored program computer, presented keynote addresses
at Micros 10 and 15. At the latter session, Wilkes described the
Microprogramming Repository. The University of Southwestern Louisiana
facility was started by Bruce Shriver, who is with the IBM T.J. Watson
Research Center and is the editor-in-chief of Computer.

During the last several workshops, from Micro 17 to Micro 20, the
implications of the legal issues of microcode copyright have been an
important area of discussion.

The current state of microprogramming technology is the subject of a
book entitled Methods of Microprogramming and Firmware Engineering
edited by Stanley Habib, one of the early SIGMicro chairs. The book
will be available this June.

Although the ideas microprogramming generates are important in
themselves, and also pervade many other areas of computer architecture,
a continuing question arises at the workshops. The question is: “What
is microprogramming?” It’s a question that persists not because of a
lack of answers but because of an abundance of answers.


The software was demonstrated on an Apollo DN3000, loaned to the
workshop by Apollo’s Denver office. Copies of Linn’s microarchitecture
are available from the editor of SIGMicro’s newsletter.

Nearly 90 researchers from 10 countries on four continents participated
in the workshop, with two-thirds representing academic halls and the
remaining attendees representing industry and government
organizations.

A working meeting as opposed to a conference, the workshop had no
parallel sessions; instead, attendees were encouraged to participate in
discussions following the presentation of each paper and in evening
discussion groups on active microprogramming topics.

As usual, the loudest discussions centered around the microprogramming
“shoot-out” directed by Will Tracz of IBM. This is an annual event.

Speakers presented 22 papers at the workshop, including seven on
microcode compaction and optimization techniques, three on pipeline
architectures, three on microarchitecture tuning, three on real
implementations, two on development systems, and one on applying
formal methods to writing correct microcode.

Vicki Allan of Utah State University presented a very lucid overview of
compaction as it relates to horizontal machines. Her presentation was
judged the best of the workshop.

Nowak’s paper, focusing on graph-based retargetable microcode
compilation in the Mimola design system, presented the details of the
process of retargeting and also provided an overview of the entire
system. It was a very impressive piece of work.

Fred Homewood of Inmos Ltd., U.K., presented a fascinating paper on the
use of software tools to map Occam processes to silicon. In this way,
distributed systems of communicating processors are treated as though
they are simple cells. The use of Occam as a higher level language
greatly simplifies the process of building processor elements.

David Shepherd, also of Inmos Ltd., presented a paper on the use of
mathematical logic and formal methods to write provably correct
microcode. Researchers have discussed this issue for years; it was nice
to finally see it in use. This focus within Inmos seems to be through
the influence and direction of David May, the master architect of both
Occam and the transputer.

Stephen Melvin of the University of California/Berkeley presented the
results of a microcode-based tool analysis for tracing operating system
events. The presentation included “real” data for the VAX 8600. It was
an outstanding paper for those looking for performance-distribution
data.

To close out the workshop, David Archer of Digital Equipment
Corporation presented a paper on the microcode aspects of Digital’s
CVAX architecture.

Additional papers dealt with optimization issues, microcode for Lisp
machines, pipeline architectures, and so on; each was important in its
own way. In addition, TC Micro and SIGMicro conducted a joint evening
board meeting.

This year’s Micro 21 will be held in San Diego. Additional
demonstrations, perhaps even the first student entry, are sought.

If all goes according to plan, Micro 22 in Dublin, Ireland, in 1989
will not only be the 22nd Workshop on Microprogramming but also the
First International Workshop on Microprogramming and
Microarchitecture.


Phoenix conference 
keynoter to focus on 
AI applications 

John McCarthy of Stanford University will present the keynote address
when he appears at the seventh annual Phoenix Conference on Computers
and Communications in Scottsdale, Ariz., March 16-18. The title of his
talk is “Current and Future Applications of Artificial Intelligence.”

McCarthy has been a major contributor to the AI field since the
earliest days of computing. His interest in AI began in 1949. His
numerous accomplishments include developing the Lisp programming
language in 1958, receiving the ACM’s A.M. Turing Award in 1971 and the
first Research Excellence Award of the International Joint Conference
on Artificial Intelligence in 1985, and election to the National
Academy of Engineering in 1987.

The conference will also include two days (March 17 and 18) of five
concurrent tracks covering computers, communications, software,
network technologies, and AI. A full day of tutorials will be presented
March 16.

11th Testability 
Workshop slated 

Focusing on current technology, the Computer Society will sponsor the
11th annual IEEE Workshop on Design for Testability April 19-22 in
Vail, Colo.

Richard L. Lemke, general manager of computer-aided engineering for
Tektronix, will keynote the workshop. His address is entitled “Linking
Design and Test—Using Virtual Instruments to Explore Testability.”

“The increasing complexity of both electronics products and test
equipment creates the need for software that facilitates verification
of design and physical prototypes,” Lemke said. “Computer-aided test
software works with both real and virtual instruments. Virtual
instruments verify design integrity and testability. Real instruments
verify the physical process.”

Workshop Chair T.W. Williams of IBM said other key sessions of the
workshop will be devoted to the emerging fields of biased random
pattern self testing and delay testing techniques. Both areas are
drawing a great deal of interest, Williams said.

Additional information may be obtained by contacting T.W. Williams,
IBM Corp.—67A/021, 6300 Diagonal Hwy, Boulder, CO 80301;
(303) 924-7692.
CALENDAR


March 1988 


1988 International Development Center Con¬ 
ference (DCI), Mar. 13-16, Orlando, Fla. 
Contact Development Center Institute, PO 
44087, Indianapolis, IN 46244-0087; (317) 
846-2753. 

Sixth National Conference on Ada Technol¬ 
ogy, Mar. 14-17, Washington, DC. Contact 
A1 Rodriguez, US Army Communications- 
Electronics Command, AMSEL-RD-LC- 
ASST-1A, Fort Monmouth, NJ 07703; (201) 
532-4725. 


CAIA 88, Fourth International Con- 
viz ference on Artificial Intelligence Appli¬ 
cations, Mar. 14-18, San Diego. Contact Jim 
Miller, MCC, Human Interface Program, 
3500 W. Balcones Center Dr., Austin, TX 
78720; (512) 338-3342, or CAIA 88, Com¬ 
puter Society of the IEEE, 1730 Mas¬ 
sachusetts Ave., NW, Washington, DC 
20036-1903; (202) 371-1013. 


International Conference Extending Data 
Base Technology (AICA, AFCET, BCS), 
Mar. 14-18, Venice, Italy. Contact Joachim 
W. Schmidt, Universitat Frankfurt, Fachbereich 
Informatik, Dantestrasse 9, D6000 
Frankfurt 1, West Germany; phone (49) 
69-798-8101. 


Securicom 88, Sixth Worldwide Congress on 
Computer and Communications Security 
and Protection, Mar. 15-17, Paris. Contact 
Securicom 88, 8, rue de la Michodiere, 75002 
Paris, France; phone (33) 1-47-424-100. 


PCCC 88, Seventh IEEE Phoenix Conference 
on Computers and Communications, 
Mar. 16-18, Scottsdale, Ariz. 
Contact PCCC 88, c/o Carl Ryan, Motorola 
GEG, 2501 S. Price Rd., Chandler, AZ 
85248-2899; (602) 732-3074. 


21st Simulation Symposium (SCS, IMACS), 
Mar. 16-18, Tampa, Fla. Contact Alfred 
Jones, Computer Science Dept., Florida 
Atlantic University, Boca Raton, FL 33431; 
(305) 393-3675. 


Western Educational Computing Workshops 
(CECC), Mar. 17-18, Norwalk, Calif. Con¬ 
tact Alexia Devlin, San Francisco State Uni¬ 
versity, Accounting Data, NADM-358, 1600 
Holloway Ave., San Francisco, CA 94132. 

11th PACS Computer Festival, Mar. 19, 
Philadelphia, Pa. Contact Stephen A. 
Longo, Philadelphia Area Computer Society, 
c/o La Salle University, Philadelphia, 
PA 19141; (215) 951-1255. 


N.C. Contact William A. Smith, Dept. of 
Electrical Engineering, University of North 
Carolina, Charlotte, NC 28223; (704) 
547-4142. 

NCGA 88, Ninth Conference and Exposition 
on Computer Graphics Applications, Mar. 
20-24, Anaheim, Calif. Contact NCGA 88, 
National Computer Graphics Assoc., 2722 
Merrilee Dr., Suite 200, Fairfax, VA 22031. 

Compstan 88, Computer Standards 
Conference, Mar. 21-23, Arlington, 
Va. Contact James Hall, US Dept. of Commerce, 
National Bureau of Standards, Technology 
Bldg. 225, Rm. B266, Gaithersburg, 
MD 20899; (301) 975-3273. 

RIAO 88, User-Oriented Content-Based Text 
and Image Handling Conference (AFIPS), 
Mar. 21-24, Cambridge, Mass. Contact 
RIAO 88 Conference Service Office, Mas¬ 
sachusetts Institute of Technology, Bldg. 7, 
Rm. 111, Cambridge, MA 02139; or CID, 36 
bis, rue Ballu, 75009 Paris, France. 

Sixth IEEE VLSI Test Workshop, 
Mar. 22-23, Atlantic City, N.J. Contact 
Wesley E. Radcliffe, IBM East Fishkill, 
Dept. 277, Bldg. 321-5E1, Hopewell Junc¬ 
tion, NY 12533; (914) 894-4346. 

AAAI Spring Symposium, Mar. 22-24, Palo 
Alto, Calif. Contact American Assoc. for 
Artificial Intelligence, 445 Burgess Dr., 
Menlo Park, CA 94025-3496; (415) 
328-3123. 

BANKAI 88 (SWIFT), Mar. 22-24, London. 
Contact Jean Steinier, Society for World¬ 
wide Interbank Financial Telecommunica¬ 
tions, PO 2005, Culpeper, VA 22701; (703) 
829-1300. 

CAE India 88, India Computer Graphics 
International Conference and Exhibition, 
Mar. 22-25, New Delhi, India. Contact Tara 
S. Ganguli, Technology and Research 
Associates, 5 Lindsay St., Calcutta 700087, 
India; phone (033) 299-420. 

COIS 88, Conference on Office Information 
Systems (ACM), Mar. 23-25, 
Palo Alto, Calif. Contact Robert Allen, 
2A-367, Bell Communications Research, 
Morristown, NJ 07960; (201) 829-4315. 

Extending Database Technology (IASI, 
INRIA), Mar. 23-25, Venice, Italy. 
Contact Stefano Ceri, Politecnico di Milano, 
Dipart. di Elektronika, Piazza Leonardo da 
Vinci 32, 20133 Milano, Italy; phone 39 (02) 
236-7241. 

Built-In Self-Test Workshop, Mar. 
23-25, Charleston, S.C. Contact 
Richard Sedmak, Self-Test Services, 6 Lin- 


International Congress on CIM Databases, 
Mar. 27-29, Cambridge, Mass. Contact Kim 
Takita, 824 Boylston St., Chestnut Hill, MA 
02167; (617) 232-8080. 

Software Publishers Assoc. 1988 Spring 
Symposium, Mar. 27-30, Berkeley, Calif. 
Contact SPA Conference Registration, 1101 
Connecticut Ave. NW, Suite 901, Washing¬ 
ton, DC 20036; (202) 452-1600. 

Fourth International Conference on Pattern 
Recognition (BPRA, IAPR), Mar. 28-30, 
Cambridge, England. Contact J. Kittler, 
Dept. of Electronic and Electrical Engineering, 
University of Surrey, Guildford GU2 
5XH, England, UK. 

Infocom 88, Conference on Computer 
Communications, Mar. 28-31, New 
Orleans. Contact Infocom 88, c/o Ron Rut¬ 
ledge, Martin Marietta Energy Systems, MS 
271, Bldg. 4500N, Oak Ridge National 
Laboratories, PO X, Oak Ridge, TN 37831; 
(615) 625-7643. 


April 1988 

Applications of Artificial Intelligence 
VI, Apr. 4-6, Orlando, Fla. Contact 
Mohan M. Trivedi, University of Tennessee, 
ECE, Ferris Hall, Knoxville, TN 37996-2100; 
(615) 974-5450. 

Conference on the Human Dimension in 
Artificial Intelligence, Apr. 6-9, Lexington, 
Ky. Contact Engineering Continuing Educa¬ 
tion, 223 Transportation Research Bldg., 
University of Kentucky, Lexington, KY 
40506-0043; (606) 257-4295. 

Computer Networking Symposium, 
Apr. 11-13, Washington, DC. Contact 
George K. Chang, Bell Communications and 
Research, 6 Colbert Pl., Piscataway, NJ 
08854; (201) 699-3879. 

CompEuro 88, Apr. 11-15, Brussels. 
Contact Jacques Tiberghien, Vrije 
Universiteit Brussel, Pleinlaan 2, B 1050 
Brussels, Belgium; phone 32 (02) 641-2905. 

ICSE 88, 10th International Conference 
on Software Engineering (ACM), 
Apr. 11-15, Raffles City, Singapore. Contact 
Tan Chin Nam or Lim Swee Say, National 
Computer Board, 71 Science Park Dr., NCB 
Building, Singapore 0511; phone (65) 
772-0405. 


IPC 88, 17th International Programmable 
Controllers Conference (ESD, SMI), Apr. 
12-14, Detroit. Contact IPC 88, 100 Farnsworth, 
Detroit, MI 48202; (800) 457-9504. 

VHDL Users’ Workshop, Apr. 17-20, 
Charlottesville, Va. Contact Ron Waxman, 
University of Virginia, Thornton Hall, 
Charlottesville, VA 22901. 


Georgia Tech Research Institute, O’Keefe 
Bldg., Rm. 40, Atlanta, GA 30332; (404) 
894-3412. 

AutoCAD Expo 88, May 2-5, Chicago. Con¬ 
tact Autodesk, Inc., 2320 Marinship Way, 
Sausalito, CA 94965; (415) 332-2344. 


Floor, Toronto, Ontario, M5T 2Y1; phone 
(416) 593-4040. 

41st SPSE Conference, May 22-27, Washing¬ 
ton, DC. Contact Society for Imaging 
Science and Technology, 7003 Kilworth 
Lane, Springfield, VA 22151. 


Information Systems Conference (ASM), 
Apr. 17-20, San Diego, Calif. Contact 
Assoc. for Systems Management, 24587 
Bagley Rd., Cleveland, OH 44138; (216) 
243-6900. 


1988 IEEE Symposium on Security and 
Privacy, Apr. 18-21, Oakland, Calif. 
Contact Dave Bailey, US Dept. of Energy, 
Production Operations Division/CIM 
Office, PO 5400, Albuquerque, NM 87115; 
(505) 846-4600. 

Eastern Simulation Conference (SCS), Apr. 
18-21, Orlando, Fla. Contact Society for 
Computer Simulation, PO 17900, San 
Diego, CA 92117; (619) 277-3888. 

Workshop on Factory Communications, 
Apr. 19-20, Gaithersburg, Md. 
Contact Alfred C. Weaver, Dept. of Computer 
Science, University of Virginia, Charlottesville, 
VA 22903; (804) 979-7529. 

11th IEEE Workshop on Design for 
Testability, Apr. 19-22, Vail, Colo. 
Contact T.W. Williams, IBM Corp., PO 
1900, Dept. 67A/021, Boulder, CO 80301- 
9191; (303) 924-7692. 

Second International Conference on 
Expert Database Systems, Apr. 25-28, 
Tysons Corner, Va. Contact Edgar H. 
Sibley, George Mason University, ICSE 
Dept., 4400 University Dr., Fairfax, VA 
22030. 


Working Conference on Parallel Processing 
(IFIP), Apr. 25-27, Pisa, Italy. Contact 
Michel Cosnard, Laboratoire TIM3, Institut 
National Polytechnique de Grenoble, 46 
Avenue Felix Viallet, 38031 Grenoble Cedex, 
France; phone (33) 7643-3726. 

IEEE International Conference on Robotics 
and Automation, Apr. 25-29, Philadelphia. 
Contact Harry Hayman, 738 Whitaker Ter., 
Silver Spring, MD 20901; (301) 434-1990. 

Sococo 88, Symposium on Software for 
Computer Control (IFAC, IFIP), Apr. 
26-28, Johannesburg, South Africa. Contact 
IFIP, 3 rue de Marche, CH-1204, Geneva, 
Switzerland. 


Second Parallel Processing Symposium 
(California State University at Fullerton), 
Apr. 28-30, Fullerton, Calif. Contact 
Larry Canter, 1619 N. Hale, Fullerton, CA 
92631; (714) 738-3414. 


May 1988 

Machine Vision Hands-On Workshop, May 
2-3, Atlanta. Contact Constantin Soulakos, 


Second Military Computing Conference, 
May 3-5, Anaheim, Calif. Contact Military 
Computing Conference, PO 428, Los Altos, 
CA 94023; (415) 494-2800. 

Artificial Intelligence and Advanced Computer 
Technology Conference and Exhibition 
(SCS), May 4-6, Long Beach, Calif. Contact 
Tower Conference Management Co., 800 
Roosevelt Rd., Glen Ellyn, IL 60137-5835; 
(312) 469-3373. 

19th Pittsburgh Conference on Modeling 
and Simulation (IEEE, ISA, SCS), May 5-6, 
Pittsburgh. Contact William G. Vogt or 
Marlin H. Mickle, Modeling and Simulation 
Conference, 348 Benedum Engineering Hall, 
University of Pittsburgh, Pittsburgh, PA 
15261. 


Workshop on Parallel and Distributed 
Debugging (ACM), May 5-6, Madison, Wis. 
Contact Bart Miller, Computer Sciences 
Dept., University of Wisconsin, 1210 W. 
Dayton St., Madison, WI 53706; (608) 
263-3378. 


Fourth International Software Process 
Workshop, May 11-13, Devon, England. 
Contact Leon Osterweil, Computer 
Science Dept., University of Colorado, 
Campus Box 430, Boulder, CO 80309; (303) 
492-8787. 


Fifth Workshop on Real-Time Operating 
Systems, May 12-13, Washington, 
DC. Contact John A. Stankovic, Dept. of 
Computer and Information Science, Univer¬ 
sity of Massachusetts, Amherst, MA 01003; 
(413) 545-0720. 

Software Maintenance Assoc. Conference, 
May 15-18, Chicago. Contact Suzanne 
Grenoble, SMA, Box 391432, Mountain 
View, CA 94039; (408) 730-1132. 

CHI 88, Conference on Human Factors in 
Computing Systems (ACM), May 15-19, 
Washington, DC. Contact Gail A. Chumura, 
5214 Monroe Dr., Springfield, VA 
22151; (703) 750-9401. 

Second Symposium on Space C3 (AFCEA), 
May 16-19, Annapolis, Md. Contact Randy 
Crawford/DQ, IIT Research Institute, 185 
Admiral Cochrane Dr., Annapolis, MD 
21401; (301) 224-2295. 

NATO Advanced Research Workshop on 
Highly Redundant Sensing in Robotic Sys¬ 
tems, May 16-21, Tuscany, Italy. Contact 
Julius Tou, Center for Information 
Research, University of Florida, Gainesville, 
FL 32611; (904) 335-8018. 

CIPS Congress 88, May 18-20, Toronto, 
Canada. Contact Canadian Information 
Processing Society, 243 College St., 5th 


Third International IEEE Conference 
on Ada Applications and Environments, 
May 23-26, Manchester, N.H. Contact 
Derek S. Morris, Dept. of Electrical 
Engineering and Computer Science, Stevens 
Institute of Technology, Hoboken, NJ 
07730; (210) 420-5606. 


SID 88, May 23-27, Anaheim, Calif. Contact 
James N. Price, Naval Ocean Systems Cen¬ 
ter, Attn.: Code 713, San Diego, CA 92152; 
(619) 225-2665. 


18th International Symposium on 
Multiple-Valued Logic, May 24-26, 
Madrid, Spain. Contact Enric Trillas, Investigaciones 
Cientificas, Serrano 117, 28006- 
Madrid, Spain; phone 34 (91) 621-6264. 


CG International 88 (CGS, BCS), May 
24-27, Geneva. Contact N. Magnenat- 
Thalmann, MIRALab HEC, 5255 Decelles, 
Montreal, Canada H3T 1V6. 


SIGMetrics Conference on Measurement and 
Modeling of Computer Systems (ACM), 
May 24-27, Santa Fe, N.M. Contact Connie 
U. Smith, L and S Computer Technology, 
1114 Buckman Rd., Santa Fe, NM 87501; 
(505) 988-3811. 


International Conference on Systolic 
Arrays, May 25-27, San Diego. Contact 
Keith Bromley, Code 741-T, Naval 
Ocean Systems Center, 271 Catalina St., San 
Diego, CA 92152; (619) 225-7008. 


International Workshop on Artificial Intelli¬ 
gence for Industrial Applications (IEEE, 
SICE), May 25-27, Hitachi, Japan. Contact 
Takao Sasayama, Technology Planning 
Group, Hitachi America Ltd., 50 Prospect 
Ave., Tarrytown, NY 10591-4698; (914) 
332-5800, or Kotaro Hirasawa, 10th Dept., 
Hitachi Research Laboratory, Hitachi, Ltd., 
4026 Kuji-cho Hitachi-shi, 319-12, Japan; 
(02)94-52-5111. 


15th IFAC Workshop on Real-Time Pro¬ 
gramming, May 25-27, Valencia, Spain. 
Contact WRTP 88, Grupo de Informatica 
Industrial—DSIC/DISCA, Universidad 
Politecnica de Valencia, PO 22012, E-46071 
Valencia, Spain; phone (34) 6-360-4041. 


Eurocrypt 88, Workshop on the Theory and 
Application of Cryptologic Techniques 
(IACR), May 25-27, Davos, Switzerland. 
Contact Paul Schobi, Ltd., Althardstr. 70, 
CH-8105 Regensdorf, Switzerland. 


15th International Symposium on 
Computer Architecture (ACM), May 
30-June 3, Honolulu. Contact H.J. Siegel, 
Supercomputing Research Center, 4380 
Forbes Blvd., Lanham, MD 20706; (301) 
731-3700. 
CALL FOR PAPERS 


Call for papers for Computer 

Computer seeks articles for the special January 1989 theme issue on Real Machines: 
Design Choices/Engineering Tradeoffs. 

Six copies of each complete manuscript should be submitted by May 15, 1988 to 
Yale Patt, Computer Guest Editor, College of Engineering, Dept. of Electrical 
Engineering and Computer Science, 573 Evans Hall, University of California at 
Berkeley, Berkeley, CA 94720. 


IEEE Software: Articles are sought for 
a special issue on software validation 
and verification describing the emerging 
technologies and the underpinnings of this 
proven technology. Submit article to Dolores 
Wallace, Guest Editor, National Bureau of 
Standards, Bldg. B266, Gaithersburg, MD 
20899; (301) 975-3340, or to Ted G. Lewis, 
Editor-in-Chief, IEEE Software, c/o Com¬ 
puter Science Dept., Oregon State Univer¬ 
sity, Corvallis, OR 97331; (503) 754-2744; 
CSnet lewis@oregon-state; Compmail + 


IEEE Software: Articles are sought for 
a special issue that show how working 
systems use rapid-prototyping technology, 
what issues remain, and how rapid prototyp¬ 
ing interacts with other areas. Submit paper 
to Giorgio Bruno, Guest Editor, Dip. 
Automatica e Informatica, Politecnico di 
Torino, Corso Duca degli Abruzzi 24, 10129 
Torino, Italy; phone 39 (11) 556-7003, or to 
Ted G. Lewis, Editor-in-Chief, IEEE Soft¬ 
ware, c/o Computer Science Dept., Oregon 
State University, Corvallis, OR 97331; (503) 
754-2744; CSnet lewis@oregon-state; 
Compmail + t. lewis. 


IEEE Transactions on Reliability: Special 
issue on reliability of parallel and distributed 
computing networks scheduled for publica¬ 
tion in late 1988. Submit contributions to 
Dharma P. Agrawal or Suresh Rai, Dept. of 
Electrical and Computer Engineering, PO 
7911, North Carolina State University, 
Raleigh, NC 27695-7911; (919) 737-2336. 


Supercomputing 88: Nov. 14-18, 1988, 
Orlando, Fla. Submit paper by Mar. 
14, 1988 to Stephen F. Lundstrom, ERL 455, 
Stanford University, Stanford, CA 94305; 
(415) 723-0140. 


Summer Computer Simulation Conference 
(SCS): July 25-27, 1988, Seattle. Submit 
complete manuscript by Mar. 15, 1988 to 
Charles A. Pratt, PO 17900, San Diego, CA 
92117; (619) 277-3888. 


Eighth International Conference in Com¬ 
puter Science: July 4-8, 1988, Santiago, 
Chile. Submit abstract by Mar. 15, 1988 and 
final paper by June 1, 1988 to Alberto O. 
Mendelzon, CSRI, University of Toronto, 10 
King’s College Rd., Toronto, Canada M5S 
1A4. 


IEEE Expert is seeking materials for 
publication. Submit articles on technology 
and AI applications to David Pessel, 
Editor-in-Chief, BP America, 4440 Warrensville 
Center Rd., Cleveland, OH 44128; 
reports on conferences, short subjects and 
papers on PCs, products, and resources to 
Henry Ayling, Managing Editor, IEEE 
Expert, 10662 Los Vaqueros Cir., Los 
Alamitos, CA 90720; and book reviews to 
K.S. Shankar, Associate Editor, Federal Sys¬ 
tems Division, IBM Corp., 3700 Bay Area 
Blvd., Houston, TX 77058. 


Technical Committee on Computer 
Education, Computer Society of the 
IEEE: Contributions are welcomed for the 
TCCE newsletter, a forum for the exchange 
of ideas among persons interested in com¬ 
puter education or computers in education. 
Submit news items, short articles, and correspondence 
to Helen Hays, Dept. of Computer 
Science, Southeast Missouri State 
University, Cape Girardeau, MO 63701; 
(314) 651-2244. 


OOPSLA 88, Conference on Object- 
Oriented Programming: Systems, Lan¬ 
guages, and Applications (ACM): Sept. 
25-29, 1988, San Diego. Submit paper by 
Mar. 15, 1988 to Kurt Schmucker, OOPSLA 
88, c/o Apple Computer, 5950 Symphony 
Woods Rd., Suite 410, Columbia, MD 
21044. 


International Software for Strategic 
Systems Conference: Oct. 25-26, 1988, 
Huntsville, Ala. Submit abstract by Mar. 15, 
1988 to Wayne Smith, 635 Discovery Dr., 
Huntsville, AL 35806; (205) 721-1941. 


GOMAC 88, Government Microcircuit 
Applications Conference: Nov. 8-10, 1988, 
Las Vegas, Nev. Submit summary by Mar. 
16, 1988 to Jay Morreale, Palisades Institute 
for Research Services, G-88, 201 Varick St., 
Rm. 1140, New York, NY 10014. 


CSM 88, Conference on Software 
Maintenance—1988 (DPMA, NBS): 
Oct. 24-27, Phoenix, Ariz. Submit paper by 
Mar. 18, 1988 to Wilma M. Osborne, 
National Bureau of Standards, Bldg. 225, 
Rm. B266, Gaithersburg, MD 20899; (301) 
975-3339. 

IEEE Software and IEEE Expert: 
November 1988. Papers are sought for 
special issues on expert systems applied to 
software engineering. Submit manuscript by 
Apr. 1, 1988 to Murat Tanik, Dept. of Computer 
Science and Engineering, Southern 
Methodist University, Dallas, TX 
75275-0122; (214) 692-3083, ext. 2854. 

Seventh Symposium on Reliable Distributed 
Systems: Oct. 10-12, 1988, 
Columbus, Ohio. Submit manuscript by 
Apr. 1, 1988 to Jane W.S. Liu, Dept. of 
Computer Science, University of Illinois, 
1304 W. Springfield Ave., Urbana, IL 
61801-2987; (217) 333-0135. 

ESIG 88, Fourth Expert Systems in 
Government Conference: Oct. 17-21, 
1988, Washington, DC. Submit paper by 
Apr. 1, 1988 to ESIG 88, MS W418, Mitre 
Corp., 7525 Colshire Dr., McLean, VA, 
22102. 

LFA 88, Sixth IEEE Workshop on 
Languages for Automation: Aug. 
29-31, 1988, College Park, Md. Submit paper 
by Apr. 1, 1988 to Panos A. Ligomenides, 
Electrical Engineering Dept., University of 
Maryland, College Park, MD 20742. 

CCC 89, Second Hungarian Custom Circuit 
Conference (MATE): May 10-12, 1989, 
Szeged, Hungary. Submit paper by Apr. 1, 
1988 to MATE Secretariat, 1055 Budapest, 
Kossuth L. ter 6-8, Hungary; phone (1) 
531-406. 

ICNN 88, IEEE 1988 International Confer¬ 
ence on Neural Networks: July 24-27, 1988, 
San Diego. Submit paper by Apr. 1, 1988 to 
Nomi Feldman, IEEE ICNN 88, 3770 Tansy 
St., San Diego, CA 92121; (619) 453-6222. 

International Workshop on Defect and 
Fault Tolerance in VLSI Systems: Oct. 
6-7, 1988, Springfield, Mass. Submit 
extended summary by Apr. 5, 1988 to Israel 
Koren, Dept. of Electrical and Computer 
Engineering, University of Massachusetts, 
Amherst, MA 01003; (413) 545-2643. 

CASE 88, Second International Workshop 
on Computer-Aided Software 
Engineering (ACM): July 12-15, 1988, Cam¬ 
bridge, Mass. Submit position paper by Apr. 
11, 1988 to Pamela Meyer, CASE 88, c/o 
Index Technology Corp., One Main St., 
Cambridge, MA 02142; (617) 494-8200, ext. 


" guages: Oct. 10-12, 1988, Pittsburgh. 
Submit paper by Apr. 15, 1988 to Alfs T. 
Berztiss, Computer Science Dept., Faculty of 
Arts and Sciences, University of Pittsburgh, 
322 Alumni Hall, Pittsburgh, PA 15260; or 
(in Europe) to Stefano Levialdi, Departimento 
di Matematica, Universita di Roma, 
Piazzale A. Moro, 00185, Rome, Italy. 

ICCAD 88, IEEE International Conference 
on Computer-Aided Design 
(ACM): Nov. 7-10, 1988, Santa Clara, Calif. 
Submit paper by Apr. 29, 1988 to ICCAD 88 
Secretary, Electrical and Computer 
Engineering Dept., Carnegie Mellon Univer¬ 
sity, Pittsburgh, PA 15213; (412) 268-3546. 

13th Conference on Local Computer 
Networks: Oct. 10-12, 1988, Minneapolis, 
Minn. Submit paper by May 1, 
1988 to Larry Green, Advanced Computer 
Communication, 720 Santa Barbara St., 
Santa Barbara, CA 93101; (805) 963-9431. 

Ninth Real-Time Systems Symposium: 
Dec. 6-8, 1988, Huntsville, Ala. Submit 
paper by May 1, 1988 to Al K. Mok, 
Dept. of Computer Science, TAY 3-140C, 
University of Texas/Austin, Austin, TX 
78712; (512) 471-9542. 

International Symposium on Databases 
in Parallel and Distributed Systems: 
Dec. 5-7, 1988, Austin, Texas. Submit manuscript 
by May 1, 1988 to Won Kim, 3500 Balcones 
Center Dr., Austin, TX 78759; (512) 
338-3439. 

ICCV 88, Second International Conference 
on Computer Vision: Dec. 5-8, 
1988, Tarpon Springs, Fla. Submit paper by 
May 15, 1988 to Ruzena Bajcsy, University 
of Pennsylvania, Dept. of Computer and 
Information Science, 200 S. 33rd St., 
Philadelphia, PA 19104-6389; (215) 
898-6222. 

IEEE Software: January 1989. Articles 
are sought on human-computer interfaces, 
human factors, and user issues. Submit 
articles by June 1, 1988 to Virginia Hix, 
Guest Editor, Computer Science Dept., Virginia 
Polytechnic State University, Blacksburg, 
VA 24061; (703) 961-4857, or Ted G. 
Lewis, Editor-in-Chief, IEEE Software c/o 
Computer Science Dept., Oregon State Uni¬ 
versity, Corvallis, OR 97331; (503) 754-2744; 
CSnet lewis@oregon-state; Compmail + 
t. lewis. 

1988 International Computer Science 
Conference: Dec. 19-21, 1988, Hong 
Kong. Submit paper by June 15, 1988 to 
Jean-Louis Lassez, IBM T.J. Watson 
Research Center, PO 218, Yorktown 
Heights, NY 10598, or Francis Y.L. Chin, 
Center of Computer Studies and Applica¬ 
tion, University of Hong Kong, Hong Kong. 

Proceedings of the IEEE: January 1990. A 
special issue on state-of-the-art surveys on 
supercomputing technology is planned. Sub¬ 
mit paper or proposal by June 30, 1988 to 
Tse-yun Feng or A.R. Hurson, E.E. East 
Bldg., Pennsylvania State University, Univer¬ 
sity Park, PA 16802; (814) 863-1469 or 1187. 


CAREER OPPORTUNITIES 


CONCORDIA UNIVERSITY 
Department of Electrical and 
Computer Engineering 

The department invites applications for tenure 
track faculty positions at the Assistant, Associ¬ 
ate or Full Professor levels in all areas of Com¬ 
puter Engineering. Our primary interest is in can¬ 
didates with Computer Architecture background, 
particularly in the field of parallel or distributed 
processing. Candidates should have a strong in¬ 
terest in research as well as commitment to 
teaching at the undergraduate levels in Com¬ 
puter Hardware related courses. Medium of in¬ 
struction is English. An earned doctorate is re¬ 
quired. Degrees in Computer Engineering are of 
special interest. Salary and benefits are ex¬ 
tremely attractive and negotiable. The depart¬ 
ment provides an excellent environment and op¬ 
portunity for growth with established research 
groups in many areas. The department and the 
university have excellent computer facilities, 
modern teaching and research labs and support 
staff. Interested candidates should submit 
resumes with the names of three referees to: 
Jeremiah H. Hayes, Chairman, Electrical and 
Computer Engineering Department, CONCOR¬ 
DIA UNIVERSITY, 1455 de Maisonneuve Blvd. 
W., Montreal, Quebec, Canada H3G 1M8. In accordance 
with Canadian immigration requirements, 
this advertisement is directed to Canadian 
citizens and permanent residents. 


UNIVERSITY OF MISSOURI-KANSAS CITY 

And the beat goes on... in 1987 more faculty 
and staff were added, our new space was 
remodelled again to accommodate our continu¬ 
ing growth, a Center for Advanced Technology in 
Telecommunications and Computer Networking 
is being established within our unit, courses and 
degree programs are being televised and re¬ 
search equipment is being added to our labs. 
Telecommunications, Computer Networking, Ar¬ 
tificial Intelligence and Software Studies are the 
four groupings of our research and instructional 
programs. If your work is in one of these areas 
and you are looking for a stimulating environ¬ 
ment, please contact us. 

Nine- or twelve-month appointments at all ranks 
are available at industry competitive salaries. 
Visiting and post-doctoral appointments are also 
open. Research productivity is required so time 
for research is guaranteed and increased as your 
research performance increases. Through the 
Center for Advanced Technology, substantial 
support for faculty and graduate student re¬ 
search in telecommunications and computer net¬ 
working will be generated. Perhaps you are one 
of the few outstanding people we will be adding 
to our program this year. 

Call or write: Dr. Richard G. Hetherington, Direc¬ 
tor, Computer Science, University of Missouri— 
Kansas City, 4747 Troost, Room 207, Kansas 
City, MO 64110. (816) 276-1193 (by April 15 for 
Fall semester appt.; by October 15 for Spring 
semester appt.) UMKC is an Equal Opportunity/ 
Affirmative Action Employer. 


RATES: $12.00 per line, $120 minimum charge 
(up to ten lines). Average six typeset words 
per line, nine lines per column inch. Add $10 
for box number. Send copy at least one 
month prior to publication to: Heidi Rex or 
Marian Tibayan, Classified Advertising, 
COMPUTER Magazine, 10662 Los Vaqueros 
Circle, Los Alamitos, CA 90720; (714) 
821-8380. 

In order to conform to the Age Discrimina¬ 
tion in Employment Act and to discourage 
age discrimination, COMPUTER may reject 
any advertisement containing any of these 
phrases or similar ones: "...recent college 
grads...," "...1-4 years maximum experi¬ 
ence...,” “...up to 5 years experience...," or 
"...10 years maximum experience." COM¬ 
PUTER reserves the right to append to any 
advertisement, without specific notice to the 
advertiser, “Experience ranges are sug¬ 
gested minimum requirements, not maxi- 
mums.” COMPUTER assumes that, since 
advertisers have been notified of this policy 
in advance, they agree that any experience 
requirements, whether stated as ranges or 
otherwise, will be construed by the reader as 
minimum requirements only. 


THE UNIVERSITY OF TENNESSEE 
SPACE INSTITUTE 

Faculty Positions in Computer Science 

Applications are invited for tenure track posi¬ 
tions at all levels. A Ph.D. in computer science or 
a closely related area and a commitment to re¬ 
search and teaching are required. Candidates 
from all areas of computer science will be con¬ 
sidered; however, preference will be given to 
candidates with expertise in artificial intelli¬ 
gence or robotics. UTSI offers M.S. and Ph.D. 
degrees in computer science with emphasis on 
applied artificial intelligence, expert systems 
and robotics. 

UTSI is a multidisciplinary graduate institute of¬ 
fering degree programs and research in engi¬ 
neering, physics, applied mathematics and com¬ 
puter science. Emphasis is placed on research 
and graduate-level instruction. In addition to a 
VAX/780 and VAX/785, two Symbolics 3600 Lisp 
machines are available for research as well as 
linkages to computers at The University of Ten¬ 
nessee in Knoxville and supercomputers at vari¬ 
ous locations. 

Rank and salaries for these positions are open 
and commensurate with qualifications. Fringe 
benefits include group life and medical in¬ 
surance, TIAA/CREF, and reduced tuition for 
dependents. UTSI occupies a scenic 365 acre 
lakeshore campus. 

Send a detailed resume and names of three refer¬ 
ences to: Professor Moonis Ali, Chairman, Com¬ 
puter Science Committee, The University of Ten¬ 
nessee Space Institute, Tullahoma, TN 37388. 
(phone: (615) 455-0631, ext. 283). 

UTSI is an AA/EEO employer 
UNIVERSITY OF HONG KONG 
Lecturer in Computer Studies 

Applications are invited for a Lectureship in the 
Centre of Computer Studies and Applications. 
Candidates should have a higher degree in Com¬ 
puter Science, Computer Engineering or Infor¬ 
mation Systems, and a strong interest in both 
teaching and research. Preference will be given 
to applicants with strong interest in research 
and development. Consideration may be given to 
applications for appointment on a short-term 
basis (please specify period). A certain amount 
of outside practice is permitted. 

Annual salary (superannuable) is on an 11-point 
scale: HK$188,040-314,340 (US $1 = HK$7.80 
as at December 21 1987). Starting salary will depend 
on qualifications and experience. At current 
rates, salaries tax will not exceed 16½% of 
gross income. Children's education allowances, 
leave, and medical benefits are provided; housing 
or tenancy allowances are also provided in 
most cases at a charge of 7½% of salary. 
Further particulars and application forms may be 
obtained from the Secretary General, Associa¬ 
tion of Commonwealth Universities (Appts), 36 
Gordon Square, London WC1H 0PF, or from the 
Appointments Unit, Registry, University of Hong 
Kong, Hong Kong. Closes 31 March 1988. 


UNIVERSITY OF WISCONSIN-MADISON 
Faculty Positions 

The Department of Electrical and Computer Engi¬ 
neering invites applications for tenure and tenure- 
track positions. A Ph.D. degree is required, and 
successful candidates are expected to partici¬ 
pate in both teaching and research activities. Ap¬ 
plicants in all areas of computer engineering are 
invited to apply, but the following areas are of 
special interest: computer architecture, computer 
networks, VLSI and computer-aided design, 
microprocessor and minicomputer applications, 
real-time control and instrumentation applica¬ 
tions, and engineering applications of artificial 
intelligence. Rank and salary will be commen¬ 
surate with qualifications and experience. Send 
resume and names of three references to J. Leon 
Shohet, Chairman, Department of Electrical and 
Computer Engineering, University of Wisconsin—Madison, 
1415 Johnson Drive, Madison, WI 
53706, an equal opportunity/affirmative action 
employer. 


UNIVERSITY OF WISCONSIN-MILWAUKEE 
Department of Electrical Engineering and 
Computer Science 

The Department expects to have Computer 
Science faculty openings. Candidates should 
have a Ph.D. in Computer Science or a closely 
related discipline. A strong commitment to 
research and teaching is expected. Senior can¬ 
didates should have an excellent research 
record. Areas of special interest are: artificial in¬ 
telligence, networks, compilers, software engi¬ 
neering, and theory. 

The Department’s current research strengths in¬ 
clude theory, architecture, parallel computation, 
data security and databases. The University 
campus is located near the shores of Lake 
Michigan and is close to pleasant and beautiful 
residential areas. Interested persons should 
send a resume with three references to: 
Professor K. Vairavan 
Chairman—Computer Science 
Dept. of Elec. Eng. and Comp. Sci. 

UW—Milwaukee 
P.O. Box 784 
Milwaukee, WI 53201 


WORCESTER POLYTECHNIC INSTITUTE 

The Computer Science Department invites appli¬ 
cations for tenure track faculty positions at all 
levels from candidates in all areas of specializa¬ 
tion. Candidates should have a Ph.D. in Com¬ 
puter Science and a strong interest in both re¬ 
search and teaching. 

Worcester Polytechnic Institute emphasizes 
quality in the undergraduate learning experience 
and is committed to an innovative project- 
oriented teaching environment. The quality of 
the undergraduate computer science degree has 
been recognized by the recent granting of ac¬ 
creditation by the Computer Sciences Accredita- 

The current goal of the Institute is to enhance 
our graduate program and improve research ac¬ 
tivities. The department seeks qualified can¬ 
didates who will help us achieve these objec¬ 
tives. WPI is located close to the center of 
Massachusetts’ minicomputer industry and ex¬ 
cellent opportunities exist for cooperative re¬ 
search and consulting. 

The Department has 12 full-time faculty with 200 
undergraduates and 50 full-time and 120 part- 
time graduate students in our M.S. and Ph.D. pro¬ 
grams. Department equipment includes 4 VAX 
750’s, an AT&T 3B-15, 3 Sun Workstations, and 
over 80 PCs (both UNIX and MS-DOS based). 
Much of this equipment is networked via two 
ethernet cables and is also connected to other 
campus facilities, which include a DEC 2060 and 
numerous VAXen. The Institute is committed to 
a new Information Science building and full cam¬ 
pus networking facilities in the near future. 
Located only 45 miles from Boston, Worcester is 
a small city of 180,000 which has recently under¬ 
gone a renaissance. It has eight colleges and 
universities and a rich variety of cultural 
activities. 

Please send a resume to Karen Lemone, Acting 
Department Head, Department of Computer Sci¬ 
ence, WPI, Worcester, MA 01609. 

WPI is an Equal Opportunity/Affirmative Action 
Employer. 


NEW JERSEY INSTITUTE OF TECHNOLOGY 
Computer and Information Science 

Department seeks assistant, associate and full 
professors for spring/fall 1988. Ph.D. in computer 
science or closely related field required. Senior 
level applicants must have proven research and 
funding record. Positions available in, but not 
limited to: distributed computing including com¬ 
puter architecture, data communications net¬ 
working, realtime computing and fault tolerance; 
software development including artificial intelli¬ 
gence, expert systems, computer graphics; office 
automation, data management systems, cog¬ 
nitive science, and computational linguistics. 
Department offers B.S., B.A., M.S., and Ph.D., in 
computer science, and Ph.D. in management 
jointly with Rutgers-Newark. Computing facili¬ 
ties include VAX 8800, VAX 8530, IBM 4361, SUN 
workstations, Symbolics machines, TI Explorers 
and graphics systems. 

NJIT is the comprehensive technological university 
of New Jersey with 7700 students enrolled in 
Newark College of Engineering, the School of Ar¬ 
chitecture, and College of Sciences and Liberal 

Send resume and names of at least three refer¬ 
ences to: 

Personnel Box CIS 
New Jersey Institute of Technology 
Newark, NJ 07102 EO/AA employer 
NJIT does not discriminate on the basis of sex, 
race, color, handicap, national or ethnic origin, or 
age in employment. 


UNIVERSITY OF BRIDGEPORT 
Computer Engineering 

Tenure track positions in Computer Engineering 
available September 1988. Candidates should 
have strong commitment to computer engineer¬ 
ing education plus strong background in any of 
the following: computer hardware, digital de¬ 
sign, VLSI design, image processing, pattern 
recognition, and computer organization, among 
others. Ph.D. in Computer Engineering preferred 
but degrees in related areas with extensive com¬ 
puter experience will be considered. The Univer¬ 
sity is located in prestigious Fairfield County, 
Connecticut within easy driving distance of New 
York and Boston. The primary goal of the engi¬ 
neering college is instruction, but we are pre¬ 
sently developing a larger research emphasis 
through the Connecticut Technology Institute, 
initially funded through a five million dollar 
federal grant. Facilities include VAX 785, DEC 
20/60, PC lab, VLSI design lab, AI workstations 
lab, digital signal processing lab using TI 320 
chips connected to PC's, Z80 microcomputer 
hardware lab, and 8086 microcomputer software 
lab. Send resume and at least three current 
references to: Dr. Stephen Grodzinsky, Depart¬ 
ment of Computer Science and Engineering, 
University of Bridgeport, Bridgeport, CT 06601. 

An Equal Opportunity Employer 


CASE INSTITUTE OF TECHNOLOGY 
Case Western Reserve University 

We invite applications for tenure-track faculty 
positions at all levels. We are particularly in¬ 
terested in candidates whose research areas in¬ 
clude VLSI systems and design automation, ap¬ 
plied artificial intelligence, data bases, software 
design environments, parallel computation, and 
analysis of algorithms. Applicants will be judged 
primarily on their ability to strengthen our quest 
for excellence in research and teaching. 

CWRU is a small private university with a total 
enrollment of about 8400, of which about 5100 
are graduate and professional students. The uni¬ 
versity campus is the hub of the pleasant area 
known as University Circle, an incorporation 
with neighboring cultural centers and museums. 
University Circle is about 10 miles from down¬ 
town Cleveland. 

Case Institute of Technology, a subunit of CWRU, 
is among the top ten engineering schools in 
terms of research funding per faculty member 
and undergraduate student quality. The depart¬ 
ment of Computer Engineering and Science has 
a young faculty of 10 and growing, and a graduate 
student body of 115 students, 48 of whom are 
currently in the Ph.D. program. Department facil¬ 
ities include a DEC VAX-11/780, Data General 
MV/10000, numerous desk-top computers (Intel, 
DEC, Sun and Apollo), several color graphic 
displays (four high and ten medium resolution), 
and hard-copy equipment (color ink jet and laser 
printers, plotters, etc.). Faculty and students par¬ 
ticipating in the Center for Automation and In¬ 
telligent Systems Research have access to the 
Center's VAX-11/782 and VAXstation 100 display 
systems. Educational computing is provided by 
the Computing Center with three DEC-2060’s, the 
Case Personal Computer Lab with 48 DEC PRO 
350’s, four AT&T 3B2's and several AT&T 6300’s, 
and the department’s MV/10000. 

Applicants should submit their resume and 
names of three references to: Professor J. 
Thomas Mortimer, Chairman; Department of 
Computer Engineering and Science; Case 
Western Reserve University, Cleveland, Ohio 
44106. 

An Equal Opportunity Affirmative Action Em¬ 
ployer. 
FROSTBURG STATE UNIVERSITY 
COMPUTER SCIENCE FACULTY POSITIONS 

The Department of Computer Science is seeking 
to staff two newly-created tenure-track positions 
beginning in the Fall, 1988. Master’s degree in 
Computer Science, Information Science, or a 
related field required. Ph.D. preferred. The can¬ 
didate should have a desire and the ability to 
teach a wide range of undergraduate computer 
courses including some of, but not limited to, 
the following areas: programming languages, 
operating systems, architecture, computing 
theory, graphics, compiler construction, robot¬ 
ics, database, telecommunications, data struc¬ 
tures, artificial intelligence and expert systems, 
as well as introductory level courses. Rank and 
salary dependent on qualifications and experi¬ 
ence. Salary range $20,000-$40,000. Send a letter 
of application, curriculum vita, transcripts and at 
least three letters of recommendation by April 
15, 1988, to: Mr. C. Douglas Schmidt, Director of 
Personnel Services, Frostburg State University, 
Frostburg, MD 21532. 

Women and minorities are encouraged to apply. 
Frostburg State University is an Affirmative Ac¬ 
tion/Equal Opportunity Employer. 


CLEMSON UNIVERSITY 

ELECTRICAL and COMPUTER ENGINEERING 
at CLEMSON UNIVERSITY is seeking applicants 
for tenure-track positions in Computer Engineer¬ 
ing program offering BS, MS and PhD degrees. 
All areas of computer engineering are of interest, 
with special emphasis on distributed system 
software and hardware, computer networking 
and computer architecture. Tenure-Track Posi¬ 
tions in Electrical Engineering degree program 
also available, with areas of specialization in¬ 
cluding microelectronics, reliability, robotics, 
communications, signal processing, electro¬ 
magnetics and power. Visiting Positions may 
also be available. Applicants for Tenure Track 
positions must have a PhD and a strong interest 
in teaching and research. Research Associate 
positions (BS, MS) are also available. Send 
resumes to Dr. A. Wayne Bennett, E&CE Depart¬ 
ment, Clemson University, Clemson, S.C. 
29634-0915. Clemson is an Equal Opportunity/Af¬ 
firmative Action Employer. 


THE PENNSYLVANIA STATE UNIVERSITY 
Computer Engineering 

Applications are invited for tenure-track and 
visiting faculty positions at all levels. Can¬ 
didates from all areas of computer engineering 
(hardware and software) will be considered. The 
Computer Engineering Program at The Pennsyl¬ 
vania State University is within the Department 
of Electrical Engineering which has over 50 
faculty members, approximately 1500 undergraduate 
majors, and 170 graduate students. Candidates 
should have a Ph.D. in Electrical/Computer 
Engineering or related areas. There are 13 
faculty members within the Computer Engineer¬ 
ing Program. Excellent instruction and research 
computing facilities are available within the 
Department, College and at the University Com¬ 
puting Center. 

Please send your letter of application, resume, or 
inquiries, together with three references to: T. 
Feng, Computer Engineering Program, Department 
of Electrical Engineering, 129 Electrical 
Engineering East, Box ES, The Pennsylvania 
State University, University Park, PA 16802. 
Deadline for applications is June 30, 1988 or until 
suitable qualified candidates are selected. “An 
Equal Opportunity/Affirmative Action Employer.” 


Education Program Staff Opening: 

Senior Computer Scientist 

The Software Engineering Institute (SEI) is a 
federally funded research and development 
center operated by Carnegie Mellon University 
under contract to the Department of Defense. 
The SEI’s objective is to provide leadership in 
software engineering and in the transition of new 
software engineering technology into practice. 
The SEI Education Program has a staff opening 
for a Senior Computer Scientist to support the 
Graduate Curriculum Project, Undergraduate 
Software Engineering Education Project, and 
Academic Affiliates Program. 

General Qualifications 

Candidates for this position should have an 
understanding of education and the university 
environment we are trying to serve, as well as an 
understanding of computer science and soft¬ 
ware engineering. University teaching experi¬ 
ence is required. Candidates must have a com¬ 
mitment to high quality innovative education so 
that they, along with other senior technical staff, 
may participate in planning the overall SEI 
educational strategy. 

Responsibilities 

This staff member will have responsibility for 
coordinating the various activities through 
which the Education Program interacts with the 
educational community. These activities include: 

• Promoting the Use of SEI Educational 
Materials 

The Education Program produces a variety 
of educational materials, including curri¬ 
culum modules, module support materials, 
books and monographs, and software tools. 
This staff member will manage the pro¬ 
cesses by which these materials are dis¬ 
seminated, including tracking their use and 
soliciting cooperation from educators in the 
creation of new materials. In particular, this 
will include managing the collection, devel¬ 
opment, and dissemination of classroom 
materials for all levels of software engineer¬ 
ing education. 

• Coordinating Academic Affiliate Activities 

This staff member will develop and execute 
plans for activities with the Academic Af¬ 
filiates of the SEI. This includes soliciting, 
reviewing, and summarizing the annual 
reports from affiliates, assisting affiliates 
with curriculum development and insertion 
efforts, and promoting summer and sabbati¬ 
cal visits to the SEI by affiliated faculty. 

• Organizing Education Program Conferences 
and Workshops 

Currently the Education Program conducts 
two Faculty Development Workshops and 
the SEI Conference on Software Engineer¬ 
ing Education annually. This staff member 
will have primary responsibility for planning 
these events, evaluating their effectiveness, 
and improving them. 

Further Information 

Contact: 

Dr. Norman E. Gibbs, Director of Education 
Software Engineering Institute 
Carnegie Mellon University 
Pittsburgh, PA 15213 
(412) 268-7703 

ArpaNet: gibbs@SEI.CMU.EDU 

The Software Engineering Institute is sponsored 
by the Department of Defense. Carnegie Mellon 
University is an equal opportunity/affirmative action 
employer. U.S. citizenship or resident alien 
status is required. 


RENSSELAER POLYTECHNIC INSTITUTE 
Faculty Positions 

Department of Electrical, Computer, 
and Systems Engineering 

Applications are invited for tenure-track faculty 
positions at all levels. Areas of interest include 
but are not limited to: computer engineering, ar¬ 
chitecture and parallel processing, hardware 
design, and performance evaluation, computer 
networks, VLSI design, optical computing and 
communications, knowledge-based systems 
and artificial intelligence. The ECSE Department 
is the largest academic unit at RPI and has a rich 
tradition of research and education. The depart¬ 
ment is seeking to add faculty who bring an in¬ 
novative approach to research and teaching. Ac¬ 
tive programs in computer engineering, solid- 
state electronics and integrated circuit design, 
control systems, robotics and automation, infor¬ 
mation and decision systems, communications 
and signal processing, electronics and circuits, 
and fusion plasma systems contribute to a dy¬ 
namic research environment. In addition to the 
extensive research facilities of the department, 
there are opportunities to initiate or participate 
in interdisciplinary research programs in one of 
the major research centers of the School of Engi¬ 
neering, including the Center for Integrated Elec¬ 
tronics, Center for Interactive Computer Graphics, 
and Center for Manufacturing Productivity and 
Technology Transfer. New faculty are eligible for 
special arrangements including summer support, 
equipment, graduate student support, and re¬ 
duced teaching load in order to encourage growth 
of their research programs. Applications or re¬ 
quests for more information should be directed to: 
Dr. Arthur C. Sanderson 
Chairman, Department of Electrical, 
Computer, and Systems Engineering 
Rensselaer Polytechnic Institute 
Troy, NY 12180-3590 

RPI is an affirmative action/equal opportunity 
employer. 


UNIVERSITY OF ALBERTA 
Department of Computing Science 

Applications are invited for two tenure-track 
positions at the Assistant/Associate Professor 
level. Responsibilities include research as well 
as teaching at the graduate and undergraduate 
levels. Strong candidates from all research areas 
will be considered, but areas of special interest 
include database systems, VLSI/computer ar¬ 
chitecture, operating systems, numerical 
analysis and computer graphics. 

The Department consists of 36 academic and 28 
support staff. Current hardware support includes 
an Amdahl 5870, a network of four VAX 11/780’s 
and about thirty Sun Workstations, and well-equipped 
microcomputer and workstation laboratories 
for graphics, VLSI, and AI research. 
Access to a Cyber 205 is available. 

Salary ranges from $32,756 to $57,236 and is commensurate 
with qualifications and experience. Send 
curriculum vitae and the names of three refer¬ 
ences, and up to three reprints or copies of im¬ 
portant publications. New Ph.D.'s should also in¬ 
clude a copy of their transcript. Apply to: 

Dr. Lee J. White, Chairman 
Department of Computing Science 
University of Alberta 
Edmonton, Alberta, Canada 
T6G 2H1 

Applications will be accepted until June 30, 
1988. 

The University of Alberta is committed to the 
principle of equity in employment. 
UNIVERSITY OF CINCINNATI 
Department of Electrical and 
Computer Engineering 

The Department invites applications for several 
new tenured or tenure-track faculty positions at 
all ranks. The Department has recently filled 
eight new faculty positions and is engaged in an 
exciting period of further growth. The primary 
areas of interest are artificial intelligence; neural 
networks; parallel processing; VLSI system 
design and testing; computer architecture; soft¬ 
ware engineering, languages, compilers and 
data structures; operating systems; computer 
graphics; fault tolerant computing; micro¬ 
processors; computer networks; database sys¬ 
tems and all other areas of computer engineer¬ 
ing and computer science. Candidates for senior 
positions are expected to have outstanding 
records of achievement with vigorous sponsored 
research program and ability to lead inter¬ 
disciplinary research centers. All candidates 
should have a strong commitment to excellence 
in teaching. Teaching loads are conducive to the 
establishment of high quality research pro¬ 
grams. Earned doctorate is required. Degrees in 
Computer Engineering or Computer Science are 
of special interest. Salary and benefits are ex¬ 
tremely attractive. We also seek an Associate/ 
Assistant Department Head who will play a leader¬ 
ship role in nurturing the newly established com¬ 
puter engineering degrees and research pro¬ 
grams. The Department offers BS, MS, and PhD 
degree programs in both Electrical and Com¬ 
puter Engineering, with 27 full time faculty, 105 
full time graduate students, 425 undergraduate 
students, and externally funded research of 
$1.5M annually. Applicants should send their 
resume to Professor Vik J. Kapoor, Head, Elec¬ 
trical and Computer Engineering, University of 
Cincinnati, Cincinnati, Ohio 45221-0030. Posi¬ 
tions will be available starting July 1, 1988. Ap¬ 
plications will be considered until the positions 
are filled. The University of Cincinnati is an Affir¬ 
mative Action/Equal Opportunity employer and 
encourages and welcomes applications from 
women and minorities. 


UNIVERSITY OF ROCHESTER 
Electrical Engineering 

We expect to fill several faculty positions in 
computer engineering or related areas. Current¬ 
ly, we have faculty groups active in VLSI design 
and testing, robotics, image/signal processing, 
computer system design automation, and net¬ 
working. Those appointed would carry out 
research in computer engineering and teach 
related material to graduate and undergraduate 
students. Applicants should have an established 
research record or, for very recent graduates, a 
commitment to establish such a record. These 
positions are expected to be filled at the Assis¬ 
tant Professor level. However, in exceptional 
cases a higher level of appointment may be forth¬ 
coming. Salary commensurate with experience. 
These are tenure track positions and candidates 
are expected to hold or to receive soon their Doc¬ 
torate in Electrical Engineering or Computer 
Science or a closely related field. We are par¬ 
ticularly interested in women and minority can¬ 
didates. The University of Rochester provides a 
receptive environment with excellent students 
and established research. Applicants should send 
a full resume, a statement of research plans, and 
copies of relevant publications to: Professor 
Sidney Shapiro, Chair, Dept. of Electrical 
Engineering, University of Rochester, Rochester, 
NY 14627. The University of Rochester is an Equal 
Opportunity Employer (M/F). 


FLORIDA INTERNATIONAL UNIVERSITY 
The State University of Florida at Miami 
Director—School of Computer Science 

Florida International University invites applica¬ 
tions and nominations for the position of Direc¬ 
tor of School of Computer Science. We are seek¬ 
ing an individual with excellent leadership skills, 
a distinguished record of research, administra¬ 
tive and teaching experience suitable for ap¬ 
pointment at the rank of Professor. An earned 
Doctorate in Computer Science or related field is 
required. Salary will be commensurate with 
qualifications. 

The Computer Science program was initiated in 
1972 and elevated to the status of a School 
within the College of Arts & Sciences in the Fall 
of 1987. Currently there are 20 full-time faculty 
members performing research in many areas of 
contemporary Computer Science including dis¬ 
tributed processing, database management, 
software engineering, discrete event simulation, 
computer architecture, analysis of algorithms, 
logic of computer programs, computer graphics, 
AI and expert systems. The student population 
in the school includes approximately 600 
undergraduates, 30 Master’s students and 10 
doctoral students. 

The School of Computer Science is expected to 
play a leadership role in the University's drive to 
move into the first rank of research universities 
while maintaining the quality of its excellent 
undergraduate program. It enjoys strong support 
from the University administration and will move 
into a new building in 1989. 

University computing is mainly done on a VAX 
8800 system. The resources of the school in¬ 
clude a VAX 750, a couple of microvaxes, 4 
Transputer workstations, Symbolics 3610, a 
Silicon Graphics IRIS 2400 Graphics system and 
numerous personal computers, all connected via 
ethernet. A $500,000 grant from the State has 
been instrumental in formulation of plans to 
modify the computer laboratory in the school. 
Current promised support calls for the expen¬ 
diture of $200,000 towards the purchase of 11 
Transputers—some with disks and one with 
graphics card, about a dozen Sun workstations 
and a Sun-4 file server. Substantial support for 
additional equipment has been committed by 
the university. 

FIU is the largest public university in South 
Florida. It has more than 16,500 students and 600 
full-time faculty members. It offers 167 aca¬ 
demic programs and courses at the Bachelor’s, 
Master’s and doctoral degree levels. 
Applications, with the names of 5 referees, and 
nominations should be sent to: 

Prof. Jai Navlakha 
Chairperson—Search and Screen Committee for DCS 
School of Computer Science 
University Park Campus 
Miami, Florida 33199. 
(305) 554-2026. 

BITNET: NAVLAKHA@SERVAX 
FIU is an Equal Opportunity/Equal Access/Affir¬ 
mative Action Employer. 


INTELLIGENCE NEWSLETTER 
NEURAL NETWORK NEWS 

The INTELLIGENCE newsletter has been first 
with all the news in neural networks and neuro¬ 
computing since 1984. News on scientific break¬ 
throughs, government spending, new companies 
and new products, important meetings, venture 
capital. Published monthly; $295/year. Ask about 
our free book offer. 

INTELLIGENCE 

P.O. Box 20008 

NY NY 10025-1510 

CALL 800-NEURALS/212-222-1123 


UNIVERSITY OF DELAWARE 
Department of Computer and 
Information Sciences 

Are you interested in joining the computer 
science faculty of a growing, dynamic depart¬ 
ment in an attractive university town within easy 
traveling distance to New York, Philadelphia, 
Baltimore, and Washington? The University of 
Delaware, centrally located on the East Coast, is 
recruiting for possible openings for tenure-track 
and visiting faculty positions in the Department 
of Computer and Information Sciences beginning 
September 1, 1988. Strong applicants in all 
areas of computer science are encouraged to ap¬ 
ply. Special interest exists for candidates with 
research expertise in symbolic mathematical 
computation, parallel processing, artificial in¬ 
telligence, networking, graphics, programming 
languages and software engineering. 

A Ph.D. degree or its equivalent, and excellence 
in research and teaching are required. Salary and 
rank will be commensurate with the candidate's 
qualifications and experience. 

The Computer and Information Sciences Depart¬ 
ment offers bachelor, master, and doctoral 
degrees. Resources devoted to academic use in 
the University Computing Center include: an 
IBM 3081D, a CDC Cyber 174, a Vax 8600 and 
Pyramid 98xe both running Unix, and more than 
75 microcomputers (IBM PC-XT’s, AT's and 
Macintosh’s). 

The Department research facilities include vari¬ 
ous workstations (Symbolics Lisp machines, 
Micro-Vax II, SUN-3’s, and IBM-AT’s) and facili¬ 
ties in a joint research lab shared with the 
Department of Electrical Engineering. The latter 
includes a VAX-8500, three VAX 780's and vari¬ 
ous other smaller machines. The equipment is 
connected to the ARPAnet, CSNet, and to 
BITNET. 

Candidates should send a curriculum vitae and 
the names of three references to Professor 
Claudio Gutierrez, Department of Computer and 
Information Sciences, University of Delaware, 
Newark, DE 19716. Positions are open until filled. 
The University of Delaware is an equal opportuni¬ 
ty, affirmative action employer. Applications 
from members of minority groups and women 
are encouraged. 


NAVAL POSTGRADUATE SCHOOL, 
Monterey, CA 

Position Announcement in Computer Science 

The Department of Computer Science has im¬ 
mediate openings for faculty positions at all 
levels. Our primary interests are in the areas of 
operating systems, programming languages, and 
algorithms. Our secondary interests are in the 
areas of processing of visual data, real-time 
systems, and software engineering. An applicant 
should have a Ph.D. in Computer Science or a 
related field and have a strong interest in both 
graduate teaching and research. Senior appli¬ 
cants must have distinguished research records. 
Appointments can begin at any time during the 

The Department offers MS and Ph.D. degrees in 
Computer Science supported by well-equipped 
instructional/research facilities and full-time 
technical staff. The faculty normally teach for 
two quarters and conduct full-time research dur¬ 
ing the other two quarters. 

Please send a detailed resume and three letters 
of reference to: Search Committee, Computer 
Science Department (Code 52), Naval Post¬ 
graduate School, Monterey, CA 93943, tel. #(408) 
646-2449. NPS IS AN EQUAL OPPORTUNITY/AF¬ 
FIRMATIVE ACTION EMPLOYER. 
UNIVERSITY OF WASHINGTON 

The Department of Computer Science may have 
openings for tenure-track faculty appointments 
starting in the 1988-89 academic year. We are 
particularly interested in applicants with re¬ 
search strengths in artificial intelligence, data¬ 
bases, and programming languages and compil¬ 
ers. However, applications from outstanding in¬ 
dividuals in other areas might also be considered. 
A moderate teaching load allows time for quality 
research and close involvement with students. 
We expect applicants to have a strong commit¬ 
ment to both research and teaching, and an out¬ 
standing record of research for their level. Any 
appointment should bring significant new re¬ 
search strength to the department. 

The department may also have several visiting 
positions that would require both teaching and 
research. It may be possible to hold these for 
portions of the 1988-89 academic year. 
Interested applicants should send a letter of ap¬ 
plication, a resume, and the names of four refer¬ 
ences to Paul Young, Chairman, Department of 
Computer Science FR-35, University of Washing¬ 
ton, Seattle, Washington 98195. 

The University of Washington is an Affirmative 
Action/Equal Opportunity Employer. The Ph.D. is 
required for these positions. 


CURTIN UNIVERSITY OF TECHNOLOGY 

The School of Computing and Quantitative 
Studies provides an applied computing orienta¬ 
tion to students in business information sys¬ 
tems. The programmes offered range from 
Bachelor of Business (Information Processing or 
Information Systems) to PhD in Information 
Systems. 

The School has a growing international reputa¬ 
tion in a number of aspects of Information Sys¬ 
tems namely Information Systems Planning and 
Strategy Formulation, System Development 
Methodologies, Decision Support Systems and 
End-user Computing. 

It is looking to expand its activity in these and 
related fields such as Software Engineering, 
Systems and Software Quality Assurance, Sys¬ 
tem Design Approaches, Automated Support 
Tools, Executive Information Systems and Com¬ 
mercial Exploitation of Expert Systems. 

Two Tenured Senior Lecturers (Ref 1089) are re¬ 
quired to take a leading role developing research 
and educational programmes in one or more of 
the above fields. In addition to teaching, ap¬ 
pointees will be required to engage in research/ 
scholarship; provide leadership/guidance to 
both staff and postgraduate students and moni¬ 
tor the relevance of the School's undergraduate/ 
postgraduate offerings in the light of current 
trends. 


OHIO STATE UNIVERSITY 
Faculty Positions in 
Human Factors Engineering 

Applications and nominations are invited for 
tenure track positions in the Department of In¬ 
dustrial and Systems Engineering at the Ohio 
State University. A Ph.D. degree is required. 
Preference will be given to applicants with 
strong background and research interests in 
cognitive engineering (especially interests in ar¬ 
tificial intelligence, human-computer interac¬ 
tions and/or human problem solving) or bio¬ 
mechanics. Applications from women and minori¬ 
ties are encouraged. Individuals will be expected 
to teach and develop undergraduate and gradu¬ 
ate courses in human factors and develop re¬ 
search programs which complement and support 
existing programs. 

Nominations and applications consisting of a 
resume and the names and addresses of three 
references should be sent to: 

Human Factors Search Committee 
Department of Industrial and Systems 
Engineering 

The Ohio State University 
1971 Neil Ave. 

Columbus, Ohio 43210 


UNIVERSITY OF TENNESSEE SPACE INSTITUTE 
Graduate Research Assistants 
in Computer Science 

Applications are invited for graduate research 
assistants in computer science, particularly in 
artificial intelligence. UTSI awards M.S. and 
Ph.D. degrees in computer science with empha¬ 
sis on applied artificial intelligence, expert 
systems and robotics. 

UTSI offers total financial assistance packages 
including tuition, fees, and monthly stipends 
which average $11,600 to $12,600 for an aca¬ 
demic 9 month appointment depending upon the 
degree program. In addition, summer appoint¬ 
ments may be available. To receive an applica¬ 
tion or for further information, please contact: 
Dr. Charles Lea, Director, Admissions and Stu¬ 
dent Services, The University of Tennessee 
Space Institute, Tullahoma, TN 37388 (phone: 
(615) 455-0631, ext. 296). 

UTSI is an AA/EEO employer 


OREGON STATE UNIVERSITY 
Department of Computer Science 

The Department of Computer Science invites 
qualified applicants for two graduate fellow¬ 
ships in Computer Science at Oregon State Uni¬ 
versity. These will begin with the Fall Quarter of 
1988 and will carry a stipend of $10,000 per year. 
Recipients will be selected on a competitive 
basis, with undergraduate performance, scores 
on the Graduate Record Examination, and refer¬ 
ences as the primary source of information. We 
expect that recipients will enroll in the Ph.D. pro¬ 
gram in Computer Science at OSU and will devote 
their full time toward the pursuit of that degree. 
For further information and for application mate¬ 
rials, please contact 

Walter G. Rudd, Chairman 
Department of Computer Science 
Oregon State University 
Corvallis, Oregon 97331 
503-754-3273 

Oregon State University is an Affirmative Ac¬ 
tion/Equal Opportunities employer and complies 
with Section 504 of the Rehabilitation Act of 
1973. 


MISSISSIPPI STATE UNIVERSITY 

Mississippi State University invites applications 
for tenure-track computer science faculty posi¬ 
tions beginning August 1988 or January 1989. 
With a newly approved Ph.D. program, applica¬ 
tions for both senior and junior positions will be 
considered. A doctorate in computer science or 
computer engineering or a record of established 
research is desired. All applications, however, 
will be considered. Interested individuals should 
forward a vita and names of at least three refer¬ 
ences to: B. D. Carter, Head; Department of Com¬ 
puter Science; Drawer CS; Mississippi State, MS 
39762. 

MSU is an Affirmative Action/Equal Opportunity 
Employer. 


UNIVERSITY OF NORTH DAKOTA 

Applications are invited for an anticipated 
tenure-track faculty position in the Computer 
Science department at Assistant, Associate, or 
Professor level. Starting date is August 16, 1988. 
Candidates must hold a Ph.D. in Computer Sci¬ 
ence or related field. Computer Science is one of 
the departments in the Center for Aerospace 
Sciences, a degree granting college of the 
University of North Dakota. The department of¬ 
fers an undergraduate major with 250 enrolled 
and a Master's program with 15. A Ph.D. program 
is anticipated within five years. The department 
is the principal user of a VAX 11/785 and PDP 
11/44, has access to two IBM 3090’s, Concurrent 
3260 and 3280 and a VAX 780. Computer Science 
offices and laboratories are in the newest aca¬ 
demic building on campus. Salary is open and 
competitive. Benefits include TIAA-CREF and 
health and life insurance. A group disability in¬ 
surance is also available. The candidate must be 
either a U.S. citizen or eligible to work in the 
United States. Deadline to apply is April 1, 1988, 
or until the position is filled. Send resume and 
three letters of reference to: Dr. Lonny Winrich, 
Department of Computer Science, Box 8181 Uni¬ 
versity Station, Grand Forks, ND 58202. UND is an 
equal opportunity affirmative action institution. 


A PhD or Master’s Degree is required together 
with a record of applied research and academic 
leadership and relevant professional experience, 
preferably at a senior level. 

Four Lecturers (two tenured, two limited term) 
(Ref-1090) are also required to teach and par¬ 
ticipate in research and educational program¬ 
mes in the following areas: 

Program Design: Knowledge of COBOL and the 
use of System Software such as screen han¬ 
dling, transaction processing and database ma¬ 
nipulation and experience in an IBM/MVS en¬ 
vironment with practical experience in on-line 
and batch data processing would be desirable. 
Systems Development: A good understanding of 
SD methods including information requirements 
determination, knowledge elicitation, data analy¬ 
sis (normalization and entity modelling), sys¬ 
tems design, specification and testing is re¬ 
quired. Experience of project management in a 
business setting would be an advantage. 
Systems Software (PCs): A broad-based knowl¬ 
edge of operating systems including hardware 
interfacing, a strong end-user orientation and a 
practical appreciation of all hardware and soft¬ 
ware aspects of personal computers is required. 
Additionally, experience in a mainframe environ¬ 
ment would be a definite advantage. 

At the Lecturer level a postgraduate qualification 
is preferred although candidates with an under¬ 
graduate qualification and extensive relevant 
professional experience will also be considered. 
Salary range: ($Aust) Senior Lecturer A$37,903 to 
A$44,090; Lecturer A$28,381 to A$37,122. 

Limited Term Appointments are available initial¬ 
ly for one to three years and to a potential max¬ 
imum of five years. 

Applications: Details including names, ad¬ 
dresses and telephone numbers of three ref¬ 
erees should be submitted not later than 8 April 
1988 to the Appointments Officer, Curtin Univer¬ 
sity of Technology, GPO Box U1987, Perth, 
Western Australia, 6001. Further information 
may be obtained by Telex (AA 92983) or Fax (619) 
458 4661 quoting “Appointments”, position 
reference number and your return airmail ad¬ 
dress or by telephoning in business hours 
Australia (619) 350 7064. 


Call for Papers and Participation 


International Symposium on Databases in 
Parallel and Distributed Systems 



Symposium General Chair 
Joseph E. Urban 
University of Miami 

Co-Program Chairs 
Sushil Jajodia 
NSF 

Won Kim 
MCC 

Avi Silberschatz 
UT-Austin 

Program Committee 
Rakesh Agrawal, AT&T Bell Labs 
Francois Bancilhon, INRIA 
John Carlis, U. of Minnesota 
Doug DeGroot, TI 
C. Ellis, Duke U. 

Shinya Fushimi, Japan 
H. Garcia-Molina, Princeton U. 
Theo Haerder, Germany 
Yahiko Kambayashi, Japan 
Gerald Karam, Carleton U. 

Michael Kifer, SUNY-Stony Brook 
Roger King, U. of Colorado 
Hank Korth, UT-Austin 
Ravi Krishnamurthy, MCC 
Duncan Lawrie, U. of Illinois 
Edward T. Lee, U. of Miami 
Eliot Moss, U. of Mass. 

Anil Nigam, IBM Yorktown Heights 
N. Roussopoulos, U of MD 
Sunil Sarin, CCA 
Y. Sagiv, Hebrew U. 

Jim Smith, ONR 
Ralph F. Wachter, ONR 
Ouri Wolfson, Technion 
Clement Yu, UI-Chicago 
Stan Zdonik, Brown U. 

Local Arrangement 
Hong-Tai Chou, MCC 


December 5-7, 1988 
Austin, Texas 

Sponsored by: 

IEEE Computer Society Technical Committee on Data Engineering & ACM 
Special Interest Group on Computer Architecture (Approval Pending) 

In Cooperation With: 

IEEE Computer Society Technical Committee on Distributed Processing 

Symposium Objectives 

The objective of this symposium is to provide a forum for database 
researchers and practitioners to increase their awareness of the 
impacts on data models and database system architecture of parallel 
and distributed systems and new programming paradigms designed for 
parallelism. A number of general purpose parallel computers are now 
commercially available, and to better exploit their capabilities, a 
number of programming languages are currently being designed based on 
the logic, functional, and/or object-oriented paradigm. Further, 
research into homogeneous distributed databases has matured and 
resulted in a number of recently announced commercial distributed 
database systems. However, there are still major open research 
issues in heterogeneous distributed databases; the impacts of the new 
programming paradigms on data model and database system 
architecture are not well understood; and considerable research 
remains to be done to exploit the capabilities of parallel computing 
systems for database applications. 

We invite authors to submit original technical papers describing recent 
and novel research or engineering developments in all areas relevant to 
the theme of this symposium. Topics include, but are not limited to, 

• Parallelism in data-intensive applications, both traditional (such 
as Transaction Processing) and non-traditional (such as 
Knowledge-Based) 

• Parallel computer architecture for database applications 

• Concurrent programming languages 

• Database issues in integrated database technology with the logic, 
functional, or object oriented paradigm 

• Performance, consistency, and architectural aspects of 
distributed databases 

The length of each paper should be limited to 25 double-spaced typed 
pages (or about 5000 words). Four copies of completed papers should be 
sent before May 1, 1988 to: 

Won Kim, MCC, 3500 West Balcones Center Dr., Austin, TX 78759 
(512) 338-3439, kim@mcc.com 


Publicity 

Ahmed K. Elmagarmid, Penn State 

Finance Chairman 
Edward T. Lee, U. of Miami 


Papers due: May 1, 1988 

Notification of Acceptance: July 15, 1988 

Camera-ready copy due: August 30, 1988 






BOOK REVIEWS 


Editor: Wiley McKinzie, School of Computer Science and Technology, Rochester Institute of Technology, Rochester, NY 14623; Compmail, w.mckinzie; CSnet, wrm@rit 


Nations at Risk: The Impact of the Computer Revolution 


Edward Yourdon (Yourdon Press, 
New York, 1986, 616 pp., $18.95) 

Most readers are aware that the num¬ 
ber of books dealing with the impact of 
computer technology on society has 
grown exponentially over the last 
several years. Having received graduate 
degrees in both computer science and 
sociology, I have been keenly interested 
in this field and have generally been dis¬ 
appointed with the offerings. Only a 
few books—for example, Social Issues 
in Computing (Gottlieb and Borodin, 
Academic Press, 1973)—have presented 
the public with a serious treatment of 
the subject. It was, therefore, with both 
anticipation and apprehension that I 
began reading Nations at Risk. Yourdon’s 
book proved to be a delightful 
addition to the short list of noteworthy 
books assessing the technological 
impact of the computer revolution. 

Yourdon addresses his remarks to the 
responsible citizen rather than the com¬ 
puter scientist. This is a book meant not 
only to inform readers but to spur them 
to take an active role in determining the 
future impact of computer technology. 
It would thus make an excellent text for 
a computer literacy or similar course. In 
addition, the book is written in an 
informal, personal style that makes it 
accessible and enjoyable to the general 
reader; Yourdon explains the technical 
terminology at a level the layperson can 
understand. Indeed, Yourdon’s ability 
to integrate his personal experiences 
into a discussion of complex technologi¬ 
cal issues increases the book’s value. 

This book is not, however, an “ordinary” 
text on the impact of computer 
technology. As its title, inspired by the 
1983 report of the Presidential Commis¬ 
sion on Excellence in Education, sug¬ 
gests, Yourdon believes that the United 
States and the world community are 
equally at risk from a technology that is 
developing faster than our ability to 
adapt to it. Unless we attempt to under¬ 
stand and mold this technology rather 
than simply respond to it, Yourdon 
thinks we will truly be Nations at Risk. 


Thus, Yourdon seeks to provide any 
person living in the computer age with 
the necessary information to under¬ 
stand current technological develop¬ 
ments and to participate actively in 
decisions regarding them. 

Yourdon divides his discussion of 
these developments into seven broad 
sections, subdivided into 39 chapters. 
Each section discusses a major institu¬ 
tion affected by computer 
technology—government, society, 
business—or a major area of the tech¬ 
nology itself—the computer industry, 
computer software, or computer hard¬ 
ware. A final section is devoted to wider 
issues that affect not only computer 
technology but humanity in general, for 
instance, nuclear war and pollution. 
Each of the topics is presented 
separately. This means the reader can 
choose to read at random about topics 
of personal interest. Unfortunately, it 
also means the book is repetitious. For 
example, Yourdon states at least five 
times that “between 50% and 80% of 
the software work done in most organi¬ 
zations is associated with revision, mod¬ 
ification, conversion,” etc., of 
previously existing programs. Such 
repetition detracts somewhat from the 
overall quality of the book, but it is 
understandable, given the author’s 
orientation. 

Perhaps a greater detraction is the 
author’s repetition of epithets. For 
example, while I might agree with Your¬ 
don’s portrayal of the Basic program¬ 
ming language as “(ugh!) BASIC,” the 
use of this epithet whenever Basic is dis¬ 
cussed becomes tiresome. Similar irrele¬ 
vant editorial comments on the 
efficiency of the post office and the 
general appearance of programmers 
occur in the text. However, these are 
only minor irritants in an otherwise 
excellent text. 

While it is impossible to summarize 
adequately a book of this breadth in a 
brief review, I would like to examine 
two of the sections, Computers and 
Society and Computer Hardware, to 
provide a flavor of Yourdon’s coverage. 
The section on computers and society 
introduces a broad range of topics, 
including robotics, computerized home 
control, computers in the banking 
industry, the growth of a new cottage 
industry among computer professionals, 
video games, and computers and chil¬ 
dren. For each topic, Yourdon describes 
the present situation and indicates the 
short- (five years hence) and probable 
long-term trends. In addition, he sug¬ 
gests courses of action depending on the 
reader’s position within society, for 
example, what the responsible parent, 
citizen, and/or computer professional 
should do. He also includes a list of 
references for future readings. 

The chapter on computers and chil¬ 
dren illustrates the richness of Your¬ 
don’s discourse. Here he outlines 
current computer use within the school 
system, for instance, simulation, pro¬ 
gramming, drill and practice, and game 
playing, including the benefits and 
detriments from each type of use. In 
addition, he discusses problems such as 
gender bias and economic inequality in 
computer use. Advice is given to par¬ 
ents and teachers. For example, Your¬ 
don recommends drawing programs for 
younger children (third grade or below) 
and the use of Logo for older children. 

It is interesting that the drawing pro¬ 
gram he recommended for my com¬ 
puter listed for $395. While I hope I am 
a responsible parent, this seems a rather 
high price to pay for the development 
of my five-year-old daughter’s, possibly 
nonexistent, artistic talents. I suspect 
that many other parents may react in a 
similar fashion. 

Yourdon’s examination of hardware 
technology is equally comprehensive. 
For the novice, he provides a highly 
readable synopsis of major computer 
components, that is, CPU, memory, 
printers, and disks. He analyzes the cur¬ 
rent hardware growth rate and makes 
predictions concerning future growth. 
In addition, he expertly summarizes 
new technological developments, 
including parallel processing, gallium 
arsenide chips, and optical and organic 
computers. While his analysis would 
probably not present any new informa¬ 
tion to a computer scientist or an 
engineer, it does provide the average 
person with a superb sketch of current 
developments. 

In conclusion, Nations at Risk is 
excellent reading. It is the best overall 
portrayal of computer technology that I 
have read. It is current, comprehensive, 
and personal. It will be enjoyed by citi¬ 
zen, student, and computer profes¬ 
sional. I would highly recommend it as 
a text in a computer literacy course, 
particularly if a softbound version is 
published. 

Susan Anderson-Freed 

Illinois Wesleyan University 


Software Engineering: 
Planning for Change 

David Alex Lamb (Prentice-Hall, 
Englewood Cliffs, N.J., 1988, 298 
pp., $47.00) 

This book provides a useful overview 
of important topics in software 
engineering and would be an appropri¬ 
ate textbook for computer science 
majors at designated undergraduate and 
graduate levels. Lamb emphasizes the 
consistent examination of issues related 
to programming in the large, and the 
text would be well received in a senior- 
level projects course or similar 
advanced group. 

I enjoyed the author’s realistic 
approach to software development and 
maintenance. He tells it like it is—just 
the way it happens in the messy, confus¬ 
ing, and bureaucratic world confronting 
software engineers. I often find that 
related textbooks are oriented more 
toward the cut-and-dried aspects of 
software engineering, using a typical 
teaching-style approach to focus on 
small, individual efforts for developing 
software. Fortunately, Lamb has taken 
a different approach. 

The book’s 20 chapters are divided 
among four sections—Overview, Soft¬ 
ware Lifetime, Specifications and 
Verification, and Other Topics—and 
there are six appendixes. Each chapter 
does what a good textbook should: 
starts with a catchy introduction, lays 
out the topics in a straightforward man¬ 
ner, finishes with a chapter wrap-up, 
and provides exercises, project sugges¬ 
tions, and further readings. Though I 
was pleased to see all of the essential 
ingredients, in some cases the book 
seemed thin on the readings, exercises, 
or project ideas. An instructor could get 
by with the minimum provided but 
would more than likely have to supple¬ 
ment these classroom-oriented items. 

Chapter 1 introduces the field of soft¬ 
ware engineering with a look at the 
question of “programming versus soft¬ 
ware engineering” as both title and 
occupation. A useful analogy suggests 
that a programmer is a technician, a 
software engineer is an electrical 
engineer, and a computer scientist is a 
physicist. The end of the chapter offers 
several good recommendations for 
readings (e.g., Brooks, Parnas, Glass), 
but other “greats” have been left out. 
(It turns out that the few authors named 
here are repeatedly cited throughout the 
book.) 

What can a life cycle concept offer 
software engineers? Chapter 2 looks 
into this debate. Lamb takes the posi¬ 
tion that the famous waterfall is good 
enough to help out, but he knows that 
we know (and he lets the student know) 
that all is not well with the concept. I 
liked his discussion of software tools, 
particularly his reminder that tools can 
change the behavior of developers, and 
the behavior of developers can change 
tools. Recommended reading includes a 
general reference to issues of the ACM’s 
Software Engineering Notes that Lamb 
should have made a bit more specific. 

Next comes a surprise chapter on 
technical writing. I often urge my stu¬ 
dents to write—not just programs, 
mind you, but prose and 
documentation—yet many books skip 
the issue of technical writing altogether. 
(Of course, students will argue that this 
is a computer class, not an English 
class.) Some books push the writing 
material all the way to the end; here we 
get it right up front. The exercises given 
at the end of the chapter are especially 
relevant and should be assigned. 

Chapter 4, on requirements analysis, 
uses a few finite-state-machine exam¬ 
ples to illustrate the major points. The 
chapter is sufficient but not a knock¬ 
out. Exercises given at the end are 
pretty good, and later chapters build on 
these problems to create progressively 
larger and larger assignments (problems 
included deal with automated calendar¬ 
ing, library-book checkout systems, 
computer board games, home budget¬ 
ing packages, and programmable calcu¬ 
lators). 

In Chapter 5, dealing with prelimi¬ 
nary design, the author introduces some 
mathematical notation to express design 
features. For the moment, the math 
doesn’t help much, but in later chapters 
the notation becomes more useful. One 
of my favorite passages in this chapter 
concerns information hiding and data 
abstraction: “Information hiding 
focuses on what to hide; abstracting 
focuses on what to reveal” (p. 48). I 
also liked the use of Pascal-ish-looking 
code to show different implementations 
of stacks (one time using a list, another 
time an array). 

Chapter 6 is on module interfaces and 
includes the first extensive use of code 
in the book. The code is PDL-like, 
Pascal-like, Ada-like and is shown in 
very brief fragments (perhaps a half 
page on the average). Obviously, stu¬ 
dents reading the book should already 
know about code and should be com¬ 
fortable looking at incomplete frag¬ 
ments and able to catch on to 
syntax/semantics on-the-fly. I found 
the code appropriate for the intended 
audience (though it’s always nice to 
have some fuller examples given). 

The next few chapters follow the 
trend started by Chapter 6. First, Chap¬ 
ter 7 examines module implementation 
issues such as namespace, exceptions, 
comments, undefined values, and so 
on. I was disappointed that no refer¬ 
ences were given here. Next, Chapter 8 
examines testing (logic, overload, tim¬ 
ing, performance, etc.). Then Chapter 9 
looks at system delivery, but the mate¬ 
rial isn’t as lively as in prior chapters 
and seems to lose the earlier realism (in 
addition, there are no references or 
exercises). Finally, Chapter 10 takes on 
the evolution of software and examines 
issues such as maintenance. 

Chapter 11 starts a new trend by 
moving into more mathematical and 
theory-based views of software 
engineering. The next several chapters 
examine (or reexamine) life cycle con¬ 
cerns with a new set of lenses. I believe 
the material would be a good introduc¬ 
tion to the topics for most students, but 
it falls short of really showing the 
power that results from the shift in 
emphasis. 

After reading 168 pages, I was ready 
for a strong show in the fifteenth chap¬ 
ter, entitled “The Workplace.” Some¬ 
how Lamb lost his bite, and the chapter 
is merely traditional. Even less exciting 
is the next chapter on scheduling (not 
another look at simple PERT!). Chap¬ 
ter 17 is on configuration management, 
and though in general it covers the 
material, I prefer such coverage to be 
more specific and to identify actual 
packages available in the marketplace. 
Chapter 18 is on a topic of increasing 
visibility and importance: software 
quality. Again the coverage is adequate 
but misses in its specificity (and no 
references are given to even guide a stu¬ 
dent toward further exploration). 

Worse still, Chapter 19, on tools, omits 
any precise details on specific tools. 

The concluding chapter admits to 
some of the topics the book didn’t get 
around to. The appendixes include an 
example program (mentioned through¬ 
out the textbook) dealing with a small 
genealogical database. They also 
include various documentation sets and 
the source code to the single sample 
program (written in IBM Pascal/VS). A 
short glossary, bibliography, and index 
complete the book. 

Overall, I recommend the book as a 
viable candidate for use in a senior-level 
undergraduate computer science course. 
Keep in mind that the students should 
already be comfortable with program¬ 
ming and interested in expanding their 
horizons. If the Lamb book is used 
carefully, students can gain some real- 
world insight into software engineering 
along with the theory. Instructors 
would probably want the course to be 
projects oriented and would use the 
book as a departure point for discussion 
(as opposed to direct lecture material). 
Weak as it is in some respects, I found 
the book enjoyable and usable for a 
classroom setting. 

Lance B. Eliot 

University of Southern California 


IBM Personal System/2: 
A Business Perspective 

James W. Hoskins (John Wiley & 
Sons, New York, 1987, 242 pp., 
$19.95, softcover) 

Though not officially so, this is an 
IBM book through and through. The 
author is an IBM engineer, the Fore¬ 
word is by an IBM manager, and the 
book “was reviewed by more than 50 
different IBM engineers.” (To round 
out the picture, this reviewer is a retired 
IBM engineer!) It’s a comprehensive 
nontechnical review of IBM’s recently 
announced (April 1987) successor to the 
IBM PC microcomputer line, the PS/2 
family. Can we still call them “micros”? 
The PS/2 Model 80 can have up to 16M 
bytes of main memory, 230M bytes of 
hard disk storage, and 1600M bytes of 
removable-media optical storage. 

This book is not a technical reference 
manual. It is a comprehensive descrip¬ 
tion of the PS/2 Models 50/60/80 from 
a nonengineering perspective. It meets 
the needs of high-level-language 
programmers, planners, prospective 
buyers, users, and anyone else who 
wants to understand PS/2 and OS/2 
(Operating System/2) with no more 
than a glance inside the covers. It will 
be particularly satisfying to those who 
have tried to assemble a coherent pic¬ 
ture of the new family from the myriad 
IBM “letters,” brochures, and guides— 
or the reviews, overviews, and sum¬ 
maries in the trade press. It is also a 
useful primer that will give the future 
PS/2 user a running start while awaiting 
delivery. It should be comprehensible to 
the computer neophyte, but it’s really 
aimed at the experienced nontechnical 
PC user. 

Appropriately, the (non-OS/2-capable) 
Model 30 is dismissed with a few early 
paragraphs and a picture as PS/2’s 
legacy from the old PC family. The 30’s 
8086 processor and 3½-inch disks are 
noted, but the dramatically improved 
graphics are not addressed. The Model 
25 was announced too late to even be 
mentioned. 

Having an “insider” author enabled 
the book to be available very early after 
PS/2’s announcement. On the down 
side, the book preaches only the IBM 
“party line.” It espouses the virtue of 
open architecture but rarely mentions 
products other than IBM’s, except for 
briefly listing, in the compatibility 
appendix, the handful of non-IBM soft¬ 
ware products tested. This is partly 
excusable because of the newness of the 
PS/2 family and the book’s early publi¬ 
cation date. 

A seven-page table of contents pro¬ 
vides detailed access to the text, making 
the book a useful reference source. The 
index is less useful, actually only a glos¬ 
sary of terminology with pointers to the 
pertinent text pages. 

The book’s seven chapters and five 
appendixes are logically organized. It’s 
well illustrated with photos, sketches, 
and screen images. Three separate chap¬ 
ters competently treat the system over¬ 
view, options and peripherals, and 
communications. 

One chapter is primarily a detailed 
walk-through describing use of the 
PS/2 with the reference diskette that 
accompanies each machine. This dis¬ 
kette contains a comprehensive user’s 
tutorial and a handful of setup utilities. 

A nine-page chapter on application 
programs treats the subject quite super¬ 
ficially, except for pointing out the 
increased potential for powerful busi¬ 
ness graphics offered by the new VGA 
(Video Graphics Array) capability. 

The final chapter addresses, in a 
necessarily simplistic way, the choice of 
the proper PS/2 model to fit the user’s 
needs. It offers separate scenarios for 
small, medium, and large businesses, 
with proposed configurations for each. 


A 16-page appendix contains IBM’s 
publication of an outside testing lab’s 
performance tests of the various PS/2 
models versus the PC XT. The listings 
of software compatibility published by 
IBM when PS/2 was announced are 
reprinted in a 40-page appendix. These 
listings (mostly for IBM software) cover 
both PC/DOS 3.3 and OS/2 (in DOS- 
compatibility mode). The other 
appendixes—on the Model 50, other 
PS/2 publications, and peripheral 
compatibility—have rather trivial 
content. 

Overall, this is a useful book for 
those who need a not-too-detailed intro¬ 
duction to the mainstream IBM PS/2 
family (i.e., Models 50/60/80). It won’t 
give you much insight into the Micro 
Channel, or details of the 80386 imple¬ 
mentation, and it won’t present much 
of a technical challenge. But it does 
provide authoritative information, eas¬ 
ily accessible, in one place. 

Charles B. Stott 

Charles B. Stott Associates 


Handbook of Software 
and Hardware 
Interfacing for IBM PCs 

Jeffrey P. Royer (Prentice-Hall, 
Englewood Cliffs, N.J., 1987, 254 
pp., $24.95) 

This book gives excellent information 
on the specifics needed to write good 
interface programs, as well as informa¬ 
tion to build rudimentary interface 
hardware. While its scope is limited 
strictly to IBM PCs, most of the infor¬ 
mation can be used for other members 
of the PC family, as well as most 
clones. The chapters are split between 
hardware- and software-interfacing 
topics; however, one cannot describe 
software interfacing without first 
describing the hardware. Consequently, 
Royer has devoted the first two-thirds 
of the book to software interfacing, 
accompanied by descriptions of each 
specific piece of hardware; the last third 
relates more to hardware interfacing. 

The book is targeted at somewhat 
experienced systems programmers and 
hardware/software engineers. An overview 
and basic understanding of the 
IBM PC hardware is needed to fully 
understand some of the concepts 
presented. While the IBM PC may seem 
a bit archaic as the choice of hardware, 
the information presented provides a 
good foundation for the continuing line 
of PCs, namely the newer class of 80286 
and 80386 machines. The book’s main 
purpose is to present an interface guide 
that the engineer can refer to when 
generating a specific interface, be it 
hardware or software. Not only is the 
book an excellent text for the interface 
developer, it is a must for anyone want¬ 
ing to write low-level programs that 
require any type of input/output. It 
also does an excellent job of discussing 
most of the DOS interrupts and functions. 

The text itself is derived from a 
course taught by the author, although it 
could just as well be used as a reference 
book for the experienced interface pro¬ 
grammer. The text contains very few 
references to other sources, and when 
they do appear, they point mostly to the 
PC and the DOS technical reference 
manuals. Actually, the work is self- 
contained and requires very few refer¬ 
ences. An index is included. 

The experienced programmer will 
find that this book shines in comparison 
with others dealing with PCs, for example, 
Peter Norton’s Inside the IBM PC 
(Brady Communications Company, 
1986). Royer’s text skips the basic and 
often boring fundamentals, which 
should already be familiar to readers 
involved with interfacing. It devotes its 
space instead to a discussion of the 
PC’s internal workings, even to the 
point where interfacing is just another 
type of application. 

The excellent layout provides a very 
good mix of text and white space. Large 
chapter headings and bold topic titles 
make the text extremely attractive. 
Assembly language examples are abun¬ 
dant throughout, as are pictorial 
descriptions of various topics. Each 
chapter has a short, one-paragraph 
summary to reinforce what the reader 
has learned. The logical division of the 
book into hardware and software ena¬ 
bles a reader to locate pertinent infor¬ 
mation in a matter of seconds. 

Chapters 1-3 contain introductory 
material, including a broad overview of 
the IBM PC hardware and a high-level 
look at the structure of DOS. Chapters 
4-11 describe in depth the disk operat¬ 
ing system and the basic input-output 
system. Chapters 12-18 contain infor¬ 
mation on display adapters, the key¬ 
board, CPU, I/O channel, direct 
memory access, and hardware interrupts. 

On the software side, the book con¬ 
tains a wealth of information pertaining 
to the inner workings of DOS. Unlike 
other material on the subject, where 
just one interrupt is recognized, this text 
includes all eight DOS interrupts. It is 
important for the user to know that 
even though one DOS interrupt handles 
99 percent of the traffic flow, the other 
interrupts do exist and are used. The 
author uses three chapters to explain 
disk structure, describing the file alloca¬ 
tion tables as well as the various 
methods to access and use files. One 
chapter discusses the differences 
between .EXE and .COM files and the 
structures of each. Another chapter 
describes the DOS executive functions 
used to manage memory usage and allo¬ 
cation. 

Royer devotes one chapter to installa¬ 
ble device drivers. While both character 
and block drivers are mentioned, he 
focuses on character drivers. In addi¬ 
tion to an in-depth explanation of a 
driver, Royer also gives an assembly 
language example of a driver. 

On the hardware side, Royer gives 
some insight into programming the 
8237A DMA controller chip and the 
8259A interrupt controller. Descriptions 
of the monochrome adapter and Color 
Graphics Adapter are included. How¬ 
ever, a description of the EGA adapter 
is lacking. The book could be improved 
by including the EGA adapter and 
omitting the introduction to logic gates. 
The operation of the keyboard, keyboard 
interrupt, and I/O is discussed. An 
excellent example of a “hot key” to 
activate a memory-resident program is 
given. The hardware chapters conclude 
with information on buses, bus cycles, 
I/O channels, I/O ports, and timing 
requirements for interface cards. 

Royer has succeeded in presenting the 
reader with a handbook for interfacing 
to the IBM PC and has also provided a 
good source of information for the user 
wishing to learn the inner workings of 
DOS. 

Jeff S. Ebert 

Unisys Corporation 


Microcomputer 
Hardware Design 

D.A. Protopapas (Prentice-Hall, 
Englewood Cliffs, N.J., 1988, 510 
pp., $42.67) 

There are a number of ways to present
the concepts of computer architecture
and design. One approach, often
used in architecture classes for undergraduate
computer science majors, is to
present the computer as a tool the student
must understand in order to use
properly, but not a tool the student will
be building. This approach provides the
computer science major with the information
needed to master the operations
of the tool.


Another approach, more appropriately
applied to electrical engineering
and computer technology majors, is to 
present computer architecture and 
design as a field to be mastered. This 
approach provides the knowledge and 
skills required to select and assemble the 
building blocks needed to create the 
tool (computer) and make it operational. 
Such is the goal of this text, and it is 
well met. 

Beginning with the organization of a 
prototypical computer, the author leads 
the reader in logical progression through 
the design process. Basic questions are 
addressed: Why should we use microcomputers?
What kind of components
are available? (Obviously just a slice
from the evolving microcomputer technology
can be presented.) What are the
typical design goals? How do we get 
from here to a full-up system? 

Microcomputer design is presented in 
a building-block fashion, with emphasis 
on standard von Neumann machines 
using a general-purpose CPU. The 
CPU, memory (both RAM and ROM), 
input/output techniques, and peripherals 
are discussed in sufficient detail to provide
the basic knowledge needed to proceed
with the design process, or to
examine and understand the concepts 
and techniques behind a design. 

I found Chapter 3 of particular 
interest. This chapter presents, in detail, 
the design of several popular 8-, 16-, 
and 32-bit CPU chips from the Intel 
808xx and Motorola 680xx families. It 
also discusses the typical support chips 
available for the various families of 
CPUs. Having all this material presented
in a single package and in a uniform
format is very valuable.

To support the material provided in 
each chapter, the author has also 
included design examples in each area 
of interest, and each chapter contains 
an extensive set of problems that help 
reinforce and expand the reader’s 
understanding of the material. 

In the Preface the author states that
the text is “. . . beyond the introductory
level . . . ,” and so it is. While no
prior knowledge of microcomputer
design is assumed, the reader must have
some knowledge of computers, programming,
computer organization, and logic
design. The text could serve well in a
two-semester senior- or one-semester
graduate-level electrical engineering
course. As a self-study text for working
engineers, it could be improved by the
inclusion of solutions to selected problems,
and an expanded reference list.


Alan E. Gould 
Synetics Corporation 
NEW LITERATURE


Programming techniques. Seven high-level
programmers provide insights into
how they develop program code in a
20-page booklet entitled “Editing Techniques
of World-Class Programmers,”
available free from Solution Systems.
The information is presented in interview
format. Contact Solution Systems,
541 Main St., Suite 410, South Weymouth,
MA 02190; (617) 337-6963 or
(800) 821-2492.

The Computer Graphics/Desktop Publishing
Directory and Buyers Guide is
scheduled for publication in April 1988, 
with updates every six months. The 
directory is a joint venture between 
Directory House and The Oryx Press. It 
will cost $50. 

For more information, contact Directory
House, 27292 Calle Arroyo, San
Juan Capistrano, CA 92675; (714) 
240-0491. 

Software legal guide revised. Nolo Press 
has revised and updated Legal Care for 
Your Software by Daniel Remer and 
Stephen Elias for the third edition 
(ISBN 0-87337-037-6, 352 pp., $29.95). 
The latest edition incorporates legal 
developments and changes in the software
industry and its trade customs.

The book covers ways to protect software
and what to do if infringement
occurs. Order from Nolo Press, 950 
Parker St., Berkeley, CA 94710; (415) 
549-1976. 

Programming with Windows (ISBN 
0-88022-299-9, Order # 99, 450 pp., 
$22.95) by Tim Farrell provides a guide 
to creating applications programs with 
Microsoft’s Windows 2.0 or 386. The 
book builds on the reader’s knowledge 
of the C programming language. A 
companion software program costs 
$39.95 (Order #211). Both are available 
from Que Corp., 11711 N. College 
Ave., Carmel, IN 46032; (317) 
573-2544. 


New ideas, creative approaches and bold leadership are 
needed for America’s process manufacturing industry to 
retain its traditional place as world leader. Honeywell and 
Arizona State University have joined forces to create a 
dynamic new graduate study program with the goal of 
developing young leaders able to meet the manufacturing 
challenges of the global marketplace of the future. 



Honeywell 


Arizona State University invites applications for its 1988 Honeywell
Industrial Fellows Program. Sponsored by the Industrial 
Automation Systems Division of Honeywell, Inc., located in 
Phoenix, this program offers an excellent opportunity for up 
to two outstanding engineers to earn a master's degree in
computer science or computer engineering from one of the 
nation’s best engineering colleges, and, at the same time, 
gain valuable work experience at one of America’s top 
developers of process manufacturing systems. 


Terms of Awards: This program will combine a two-year
master of science in computer science degree program,
covering the 1988-89 and 1989-90 academic school years,
with part-time employment (20 hours per week) at
Honeywell. Fellows will be employed by Arizona State
University as graduate research assistants. A salary of $11,000,
plus a full tuition waiver, health insurance, and a book 
allowance will be provided for each academic school year, 
and Fellows will have the option of 1988 summer 
employment with Honeywell. 

Selection criteria: The ideal applicant will have a 
bachelor's degree from an ABET-accredited institution in
computer science or computer or electrical engineering. 
Relevant work experience is a plus, as are excellent grades. 

U.S. citizenship is required. Arizona State University
vigorously pursues affirmative action in its employment, 
activity and programs. 

Applications are due April 30. 


Superconductivity. The American Physical
Society has collected 112 papers on
superconductivity published during the
first six months of 1987 in the APS
journals Physical Review Letters and
Physical Review. The reprint volume
High-Temperature Superconductivity
(ISBN 0-88318-539-3, 400 pp., $35 nonmembers,
$25 APS members) is available
from the American Physical
Society, Publications Liaison Office,
500 Sunnyside Blvd., Woodbury, NY
11797; (516) 349-7800, ext. 604.


For more information contact: 

The Honeywell Industrial Fellows Program 
Department of Computer Science 
College of Engineering and Applied Sciences 
Arizona State University 
Tempe, AZ 85287-5406 
(602) 965-3190 
Arizona State University
CASE ’88


In Cooperation With:

THE COMPUTER SOCIETY
OF THE IEEE


Second International Workshop on 
Computer-Aided Software Engineering 

Hyatt Regency Cambridge 
Cambridge, Massachusetts 
July 12-15, 1988 

Call For Papers 


The field of Computer-Aided 
Software Engineering (CASE) has 
recently emerged as a 
commercially-viable widespread 
application of software engineering 
techniques and computer 
technology to information systems 
development. 

CASE '88 provides a forum for solid, 
detailed exchange of ideas among 
key practitioners, researchers, 
developers, and leading-edge users 
in the field, resulting in meaningful 
goals for the advancement of CASE 
technology in the next five years. 

Working groups will review the 
current state-of-the-art, discuss 
present research directions, and 
consider future requirements in the 
following topic areas: 

■ Management Issues
—Management of CASE
—Technology Transfer
—Support for Top-level Management & Strategic Planning

■ Methods for Information Systems Development
—Integration of Methods
—Support between Methods & Tools
—Real-time Development Methods
—Object-Oriented Design

■ Enabling Technologies
—A.I. / Expert Systems
—Programming Language Evolution
—Database / Knowledge Base Architectures
—Ada Technology
—User Working Environment / CHI

■ Reverse Engineering

■ Standards for CASE


General Chair 

Elliot Chikofsky 

Index Technology Corporation 

Program Committee 
Ronald J. Norman 

San Diego State University 

Hasan Sayani 

Advanced Systems Technology 
Corporation 

Jerrold Grochow

American Management Systems 

John O. Jenkins 

Imperial College 

Burt Rubenstein 

Index Technology Corporation 

Tony Wasserman 

Interactive Development 
Environments 

Thomas P. Cullinane 

Northeastern University 

Karl Lieberherr 

Northeastern University 

Bob Carasik 

Pacific Bell

Peter Mager 

PSM Associates 

Lonnie Bentley 

Purdue University 

Priscilla Fowler 

Software Engineering Institute 

William Cureton 

Sun Microsystems 

Kate Ehrlich 

Symbolics 

Alan R. Hevner 

University of Maryland 

W. Richards Adrion 

University of Massachusetts 

Lt. Michael F. Merriman 

U.S. Air Force ESD 

Thomas Browdy 

Washington University 


Attendance is limited. Prospective 
attendees should submit 5 copies of 
a position paper (1 to 4 pages), 
discussion forum proposal, or 
extended abstract (up to 10 pages) 
on future directions of CASE 
technology related to one or more of 
the topic areas, by April 11, 1988. 

Machine-readable submissions are 
preferred. 

Program Coordinator 

Pamela Meyer 
CASE 88 

c/o Index Technology Corporation
One Main Street 
Cambridge, MA 02142 

(617) 494-8200 ext. 1988 

CSNET: Chikofsky@VAXE.COE.Northeastern.EDU

European Coordinator 

John O. Jenkins 
Imperial College 
School of Management 
London SW7 2PG, United Kingdom 
01-589-5111 ext. 7112 

Exhibits Chair 

Victor Barlow 

Purdue University 

Dept. of Computer Technology

Knoy Hall, Room 242 

West Lafayette, IN 47907 

(317) 494-4546


Sponsored by: 

Index Technology Corporation 

In Cooperation With: 

Imperial College 
Purdue University 
Northeastern University 
Washington University CSDP 
Greater Boston Chapter ACM 











Delta FPP™ PC/AT Card $14,950 
3.1 M PEs and Wts. 

11M Connections/sec w/o learning 
2.7M Connections/sec w/BP learning 


Software: 

ANSkit™ 13 neural models 
ANSpec™ ANS high order language

C Compiler 
Macro Assembler 


For Real World 
Artificial Neural 
Systems Applications 



FLOATING POINT PROCESSOR 


eak) PC/AT Ci 
3 RAM on board 
Pipelined Harvard Architecture 
32 and 64 bit integer 
32 and 64 bit IEEE floating point 
C Compiler 
Macro Assembler 







